Generation and Validation of Empirically-Derived TCP Application Workloads

Authors

  • Ketan Mayer-Patel
  • Steve Marron
  • Jan Prins
  • Félix Hernández-Campos
Abstract

FÉLIX HERNÁNDEZ-CAMPOS: Generation and Validation of Empirically-Derived TCP Application Workloads. (Under the direction of Kevin Jeffay)

This dissertation proposes and evaluates a new approach for generating realistic traffic in networking experiments. The main problem solved by our approach is generating closed-loop traffic consistent with the behavior of the entire set of applications in modern traffic mixes. Unlike earlier approaches, which described individual applications in terms of the specific semantics of each application, we describe the source behavior driving each connection in a generic manner using the a-b-t model. This model provides an intuitive but detailed way of describing source behavior in terms of connection vectors that capture the sizes and ordering of application data units, the quiet times between them, and whether data exchange is sequential or concurrent. This is consistent with the view of traffic from TCP, which does not concern itself with application semantics. The a-b-t model also satisfies a crucial property: given a packet header trace collected from an arbitrary Internet link, we can algorithmically infer the source-level behavior driving each connection, and cast it into the notation of the model. The result of packet header processing is a collection of a-b-t connection vectors, which can then be replayed in software simulators and testbed experiments to drive network stacks. Such a replay generates synthetic traffic that fully preserves the feedback loop between the TCP endpoints and the state of the network, which is essential in experiments where network congestion can occur. By construction, this type of traffic generation is fully reproducible, providing a solid foundation for comparative empirical studies. Our experimental work demonstrates the high quality of the generated traffic, by directly comparing traces from real Internet links and their source-level trace replays for a rich set of metrics.
Such comparison requires the careful measurement of network parameters for each connection, and their reproduction together with the corresponding source behavior. Our final contribution consists of two resampling methods for introducing controlled variability in network experiments and for generating closed-loop traffic that accurately matches a target offered load.

ACKNOWLEDGMENTS

First of all, I must thank Kevin Jeffay and Don Smith for their guidance and encouragement throughout my doctoral program. Their patience and friendship have been invaluable all these years. I also thank them, together with other faculty and student members of the Distributed and Real-Time Systems group (DiRT), for building a phenomenal infrastructure for Internet measurement and experimental networking research. DiRT students have greatly contributed to my doctoral experience, most especially Jay Aikat and David Ott. My committee members and other collaborators have contributed tremendously to my efforts. I am especially indebted to Steve Marron and Andrew Nobel, who have greatly enriched the statistical side of my work. In this regard, being part of SAMSI's "Network Modeling for the Internet" program and of the inter-disciplinary Internet study group at UNC gave me superb opportunities to widen my understanding of Internet research. I must also thank UNC's Department of Computer Science as a whole, including faculty, students and staff, for creating an outstanding research and teaching environment. Overall, my years at UNC were an incredibly positive experience. I thank the National Science Foundation, IBM, Cisco, Intel, Sun Microsystems and others for supporting this work. I am especially grateful to the Computer Measurement Group (CMG) for their doctoral fellowship. Finally, I thank my family for their support. Their constant example of hard work and their respect for intellectual endeavors have motivated me during my entire life.
My wife's help with the editing of this manuscript was invaluable, as was her constant encouragement during my graduate studies. More than anybody else, my parents gave me my passion for knowledge, and it is to them that I dedicate this doctoral dissertation.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
1 Introduction
  1.1 Abstract Source-Level Modeling
  1.2 Source-Level Trace Replay
  1.3 Trace Resampling and Load Scaling
  1.4 Thesis Statement
  1.5 Contributions
  1.6 Overview
2 Related Work
  2.1 Packet-Level Traffic Generation
  2.2 Source-Level Traffic Generation
    2.2.1 Web Traffic Modeling
    2.2.2 Non-Web Traffic Source-level Modeling
    2.2.3 Beyond Single Application Modeling
  2.3 Scaling Offered Load
  2.4 Implementing Traffic Generation
  2.5 Summary
3 Abstract Source-level Modeling
  3.1 The Sequential a-b-t Model
    3.1.1 Client/Server Applications
    3.1.2 Beyond Client/Server Applications
  3.2 The Concurrent a-b-t Model
  3.3 Abstract Source-Level Measurement
    3.3.1 From TCP Sequence Numbers to Application Data Units
    3.3.2 Logical Order of Data Segments
    3.3.3 Data Analysis Algorithm
  3.4 Validation using Synthetic Applications
  3.5 Analysis Results
    3.5.1 Variability Across Sites
    3.5.2 Time-of-Day Variability and Workload Directionality
  3.6 Summary
4 Network-Level Parameters and Metrics
  4.1 Network-level Parameters
    4.1.1 Round-Trip Time
    4.1.2 Receiver Window Size
    4.1.3 Loss Rate
  4.2 Network-level Metrics
    4.2.1 Aggregate Throughput Time Series
    4.2.2 Throughput Marginals
    4.2.3 Throughput Self-Similarity and Long-Range Dependence
    4.2.4 Time Series of Active Connections
  4.3 Summary
5 Generating Traffic
  5.1 Replaying Traces at the Source-Level
    5.1.1 Trace Partitioning
    5.1.2 Conducting Experiments
    5.1.3 Data Collection
  5.2 Validation of Source-level Trace Replay
    5.2.1 Leipzig-II
    5.2.2 UNC 1 PM
    5.2.3 Abilene-I
  5.3 Summary
6 Reproducing Traffic
  6.1 Beyond Comparing Connection Vectors
  6.2 Source-level Replay of Leipzig-II
    6.2.1 Time Series of Byte Throughput
    6.2.2 Time Series of Packet Throughput
    6.2.3 Marginal Distributions
    6.2.4 Long-Range Dependence
    6.2.5 Time Series of Active Connections
  6.3 Source-level Replay of UNC 1 PM
    6.3.1 Time Series of Byte Throughput
    6.3.2 Time Series of Packet Throughput
    6.3.3 Marginal Distributions
    6.3.4 Long-Range Dependence
    6.3.5 Time Series of Active Connections
  6.4 Mid-Chapter Review
    6.4.1 Observations on Byte Throughput
    6.4.2 Observations on Packet Throughput
    6.4.3 Observations on Active Connections
  6.5 Source-level Replay of UNC 1 AM
    6.5.1 Time Series of Byte Throughput
    6.5.2 Time Series of Packet Throughput
    6.5.3 Marginal Distributions
    6.5.4 Long-Range Dependence
    6.5.5 Time Series of Active Connections
  6.6 Source-level Replay of UNC 7:30 PM
    6.6.1 Time Series of Byte Throughput
    6.6.2 Time Series of Packet Throughput
    6.6.3 Marginal Distributions
    6.6.4 Long-Range Dependence
    6.6.5 Time Series of Active Connections
  6.7 Source-level Replay of Abilene-I
    6.7.1 Time Series of Byte Throughput
    6.7.2 Time Series of Packet Throughput
    6.7.3 Marginal Distributions
    6.7.4 Long-Range Dependence
    6.7.5 Time Series of Active Connections
  6.8 Summary
7 Trace Resampling and Load Scaling
  7.1 Poisson Resampling
    7.1.1 Basic Poisson Resampling
    7.1.2 Byte-Driven Poisson Resampling
  7.2 Block Resampling
  7.3 Summary
8 Conclusions and Future Work
  8.1 Empirical Modeling of Traffic Mixes
  8.2 Refining and Extending our Modeling
  8.3 Assessing Realism in Synthetic Traffic
  8.4 Incorporating Additional Network-Level Parameter
  8.5 Flexible Traffic Generation
BIBLIOGRAPHY

LIST OF TABLES

3.1 Breakdown of the TCP connections found in five traces.
4.1 Estimated Hurst parameters and their confidence intervals for the packet throughput time series of five traces.
4.2 Estimated Hurst parameters and their confidence intervals for the byte throughput time series of five traces.
6.1 Estimated Hurst parameters and their confidence intervals for the byte throughput time series of Leipzig-II and its four types of source-level trace replay.
6.2 Estimated Hurst parameters and their confidence intervals for the packet throughput time series of Leipzig-II and its four types of source-level trace replay.
6.3 Estimated Hurst parameters and their confidence intervals for the byte throughput time series of UNC 1 PM and its four types of source-level trace replay.
6.4 Estimated Hurst parameters and their confidence intervals for the packet throughput time series of UNC 1 PM and its four types of source-level trace replay.
6.5 Estimated Hurst parameters and their confidence intervals for the byte throughput time series of UNC 1 AM and its four types of source-level trace replay.
6.6 Estimated Hurst parameters and their confidence intervals for the packet throughput time series of UNC 1 AM and its four types of source-level trace replay.
6.7 Estimated Hurst parameters and their confidence intervals for the byte throughput time series of UNC 7:30 PM and its four types of source-level trace replay.
6.8 Estimated Hurst parameters and their confidence intervals for the packet throughput time series of UNC 7:30 PM and its four types of source-level trace replay.
6.9 Estimated Hurst parameters and their confidence intervals for the byte throughput time series of Abilene-I and its four types of source-level trace replay.
6.10 Estimated Hurst parameters and their confidence intervals for the packet throughput time series of Abilene-I and its four types of source-level trace replay.
7.1 Estimated Hurst parameters and their confidence intervals for the connection arrival time series of UNC 1 PM and UNC 1 AM, and their Poisson arrival fits.
7.2 Estimated Hurst parameters and their confidence intervals for five subsamplings obtained from the connection arrival time series of UNC 1 PM and UNC 1 AM.

LIST OF FIGURES

1.1 Network traffic seen from different levels.
1.2 An a-b-t diagram illustrating a persistent HTTP connection.
1.3 A diagram illustrating the interaction between two BitTorrent peers.
1.4 Overview of Source-level Trace Replay.
3.1 An a-b-t diagram representing a typical ADU exchange in HTTP version 1.0.
3.2 An a-b-t diagram illustrating a persistent HTTP connection.
3.3 An a-b-t diagram illustrating an SMTP connection.
3.4 Three a-b-t diagrams representing three different types of NNTP interactions.
3.5 An a-b-t diagram illustrating a server push from a webcam using a persistent HTTP connection.
3.6 An a-b-t diagram illustrating Icecast audio streaming in a TCP connection.
3.7 Three a-b-t diagrams of connections taking part in the interaction between an FTP client and an FTP server.
3.8 An a-b-t diagram illustrating an NNTP connection in "stream-mode", which exhibits data exchange concurrency.
3.9 An a-b-t diagram illustrating the interaction between two BitTorrent peers.
3.10 A first set of TCP segments for the connection vector in Figure 3.1: lossless example.
3.11 A second set of TCP segments for the connection vector in Figure 3.1: lossy example.
3.12 Distributions of ADU sizes for the testbed experiments with synthetic applications.
3.13 Distributions of quiet time durations for the testbed experiments with synthetic applications.
3.14 Distributions of ADU sizes for the testbed experiments with synthetic applications.
3.15 Distributions of quiet time durations for the testbed experiments with synthetic applications.
3.16 Bodies of the A and B distributions for Abilene-I, Leipzig-II and UNC 1 PM.
3.17 Tails of the A and B distributions for Abilene-I, Leipzig-II and UNC 1 PM.
3.18 Bodies of the A and B distributions with per-byte probabilities for Abilene-I, Leipzig-II and UNC 1 PM.
3.19 Bodies of the E distributions for Abilene-I, Leipzig-II and UNC 1 PM.
3.20 Bodies of the E distributions with per-byte probabilities for Abilene-I, Leipzig-II and UNC 1 PM.
3.21 Tails of the E distributions for Abilene-I, Leipzig-II and UNC 1 PM.
3.22 Average size of the epochs in each connection vector as a function of the number of epochs for Abilene-I, Leipzig-II and UNC 1 PM.
3.23 Average of the median size of the ADUs in each connection vector as a function of the number of epochs for Abilene-I, Leipzig-II and UNC 1 PM.
3.24 Average of the median size of the ADUs in each connection vector as a function of the number of epochs, for Leipzig-II.
3.25 Average of the median size of the ADUs in each connection vector as a function of the number of epochs for Abilene-I.
3.26 Bodies of the TA and TB distributions for Abilene-I, Leipzig-II and UNC 1 PM.
3.27 Tails of the TA and TB distributions for Abilene-I, Leipzig-II and UNC 1 PM.
3.28 Distribution of the durations of the quiet times between the final ADU and connection termination.
3.29 Bodies of the A and B distributions for the concurrent connections in Abilene-I, Leipzig-II and UNC 1 PM.
3.30 Tails of the A and B distributions for the concurrent connections in Abilene-I, Leipzig-II and UNC 1 PM.
3.31 Bodies of the TA and TB distributions for the concurrent connections in Abilene-I, Leipzig-II and UNC 1 PM.
3.32 Tails of the TA and TB distributions for the concurrent connections in Abilene-I, Leipzig-II and UNC 1 PM.
3.33 Bodies of the A distributions for UNC 1 AM, UNC 1 PM and UNC 7:30 PM.
3.34 Bodies of the B distributions for UNC 1 AM, UNC 1 PM and UNC 7:30 PM.
3.35 Bodies of the TB distributions for UNC 1 AM, UNC 1 PM and UNC 7:30 PM.
3.36 Tails of the TB distributions for UNC 1 AM, UNC 1 PM and UNC 7:30 PM.
3.37 Bodies of the TA distributions for three UNC traces.
3.38 Tails of the TA distributions for three UNC traces.
4.1 A set of TCP segments illustrating RTT estimation from connection establishment.
4.2 Two sets of TCP segments illustrating RTT estimation ambiguities in the presence of loss and early retransmission in connection establishment.
4.3 A set of TCP segments illustrating RTT estimation using the sum of two OSTTs.
4.4 A set of TCP segments illustrating the impact of delayed acknowledgments on OSTTs.
4.5 Comparison of RTT estimators for a synthetic trace: no loss and enabled delayed acknowledgments.
4.6 Comparison of RTT estimators for a synthetic trace: no loss and disabled delayed acknowledgments.
4.7 Comparison of RTT estimators for a synthetic trace: fixed loss rate of 1% for all connections.
4.8 Comparison of RTT estimators for a synthetic trace: loss rates uniformly distributed between 0% and 10%.
4.9 A set of TCP segments illustrating an invalid OSTT sample due to the interaction between loss and cumulative acknowledgments.
4.10 Comparison of RTT estimators for a synthetic trace: loss rates uniformly distributed between 0% and 10%.
4.11 Comparison of RTT estimators for synthetic traces: fixed loss rate of 1%; real RTTs up to 4 seconds.
4.12 Bodies of the RTT distributions for the five traces.
4.13 Bodies of the RTT distributions with per-byte probabilities for the five traces.
4.14 Comparison of the sum-of-minima and sum-of-medians RTT estimators for UNC 1 PM.
4.15 Comparison of the sum-of-minima and sum-of-medians RTT estimators for Leipzig-II.
4.16 Bodies of the distributions of maximum receiver window sizes for the five traces.
4.17 Bodies of the distributions of maximum receiver window sizes with per-byte probabilities for the five traces.
4.18 Measured loss rates from experiments with 1% loss rates applied only on one direction or on both directions of the TCP connections.
4.19 Bodies of the distributions of loss rates for the five traces.
4.20 Bodies of the distributions of loss rates with per-byte probabilities for the five traces.
4.21 Breakdown of the byte throughput time series for Leipzig-II inbound.
4.22 Breakdown of the packet throughput time series for Leipzig-II inbound.
4.23 Breakdown of the byte throughput time series for Leipzig-II outbound.
4.24 Breakdown of the packet throughput time series for Leipzig-II outbound.
4.25 Breakdown of the byte throughput time series for Leipzig-II outbound.
4.26 Breakdown of the packet throughput time series for Leipzig-II outbound.
4.27 Breakdown of the byte throughput time series for Abilene-I Ipls/Clev.
4.28 Breakdown of the packet throughput time series for Abilene-I Ipls/Clev.
4.29 Breakdown of the byte throughput time series for Abilene-I Clev/Ipls.
4.30 Breakdown of the packet throughput time series for Abilene-I Clev/Ipls.
4.31 Breakdown of the byte throughput time series for UNC 1 PM inbound.
4.32 Breakdown of the packet throughput time series for UNC 1 PM inbound.
4.33 Breakdown of the byte throughput time series for UNC 1 PM outbound.
4.34 Breakdown of the packet throughput time series for UNC 1 PM outbound.
4.35 Breakdown of the byte throughput time series for the three UNC traces.
4.36 Breakdown of the packet throughput time series for the three UNC traces.
4.37 Byte throughput marginals of Leipzig-II inbound, its normal distribution fit, the marginal distribution of its Poisson arrival fit, and the normal distribution fit of this Poisson arrival fit.
4.38 Packet throughput marginals of Leipzig-II inbound, its normal distribution fit, the marginal distribution of its Poisson arrival fit, and the normal distribution fit of this Poisson arrival fit.
4.39 Byte throughput marginals of UNC 1 PM outbound, its normal distribution fit, the marginal distribution of its Poisson arrival fit, and the normal distribution fit of this Poisson arrival fit.
4.40 Packet throughput marginals of UNC 1 PM outbound, its normal distribution fit, the marginal distribution of its Poisson arrival fit, and the normal distribution fit of this Poisson arrival fit.
4.41 Quantile-quantile plots with simulation envelopes for the marginal distribution of Leipzig-II inbound. The top four plots show byte throughput, while the four bottom plots show packet throughput.
4.42 Quantile-quantile plots with simulation envelopes for the marginal distribution of UNC 1 PM outbound. The top four plots show byte throughput, while the four bottom plots show packet throughput.
4.43 Wavelet spectra of the packet throughput time series for Leipzig-II inbound and its Poisson arrival fit.
4.44 Wavelet spectra of the byte throughput time series for Leipzig-II inbound and its Poisson arrival fit.
4.45 Wavelet spectra of the packet throughput time series for Abilene-I.
4.46 Wavelet spectra of the byte throughput time series for Abilene-I.
4.47 Wavelet spectra of the packet throughput time series for UNC 1 PM.
4.48 Wavelet spectra of the byte throughput time series for UNC 1 PM.
4.49 Breakdown of the active connections time series for Leipzig-II.
4.50 Impact of the definition of active connection on Leipzig-II.
4.51 Breakdown of the active connections time series for Abilene-I.
4.52 Impact of the definition of active connection on Abilene-I.
4.53 Breakdown of active connections time series for UNC 1 PM using both definitions of active connection.
4.54 Impact of the time-of-day on the active connections time series for the three UNC traces.
5.1 Overview of Source-level Trace Replay.
5.2 Diagram of the network testbed where the experiments of this dissertation were conducted.
5.3 End-host architecture of the traffic generation system.
5.4 Bodies and tails of the A distributions for Leipzig-II and its source-level trace replays.
5.5 Bodies and tails of the B distributions for Leipzig-II and its source-level trace replays.
5.6 Bodies and tails of the E distributions for Leipzig-II and its source-level trace replays.
5.7 Bodies and tails of the TA distributions for Leipzig-II and its source-level trace replays.
5.8 Bodies and tails of the TB distributions for Leipzig-II and its source-level trace replays.
5.9 Bodies of the round-trip time and receiver window size distributions for Leipzig-II and its source-level trace replays.
5.10 Bodies of the loss rate distributions for Leipzig-II and its source-level trace replays, with probabilities computed per connection (left) and per byte (right).
5.11 Bodies and tails of the A distributions for UNC 1 PM and its source-level trace replays.
5.12 Bodies and tails of the B distributions for UNC 1 PM and its source-level trace replays.
5.13 Bodies and tails of the E distributions for UNC 1 PM and its source-level trace replays.
5.14 Bodies and tails of the TA distributions for UNC 1 PM and its source-level trace replays.
5.15 Bodies and tails of the TB distributions for UNC 1 PM and its source-level trace replays.
5.16 Bodies of the round-trip time and receiver window size distributions for UNC 1 PM and its source-level trace replays.
5.17 Bodies of the loss rate distributions for UNC 1 PM and its source-level trace replays, with probabilities computed per connection (left) and per byte (right).
5.18 Bodies and tails of the A distributions for Abilene-I and its source-level trace replays.
5.19 Bodies and tails of the B distributions for Abilene-I and its source-level trace replays.
5.20 Bodies and tails of the E distributions for Abilene-I and its source-level trace replays.
5.21 Bodies and tails of the TA distributions for Abilene-I and its source-level trace replays.
5.22 Bodies and tails of the TB distributions for Abilene-I and its source-level trace replays.
5.23 Bodies of the round-trip time and receiver window size distributions for Abilene-I and its source-level trace replays.
5.24 Bodies of the loss rate distributions for Abilene-I and its source-level trace replays, with probabilities computed per connection (left) and per byte (right).
6.1 Byte throughput time series for Leipzig-II inbound and its four types of source-level trace replay.
6.2 Byte throughput time series for Leipzig-II outbound and its four types of source-level trace replay.
6.3 Packet throughput time series for Leipzig-II inbound and its four types of source-level trace replay.
6.4 Packet throughput time series for Leipzig-II outbound and its four types of source-level trace replay.
6.5 Byte throughput marginals for Leipzig-II inbound and its four types of source-level trace replay.
6.6 Byte throughput marginals for Leipzig-II outbound and its four types of source-level trace replay.
6.7 Packet throughput marginals for Leipzig-II inbound and its four types of source-level trace replay.
6.8 Packet throughput marginals for Leipzig-II outbound and its four types of source-level trace replay.
6.9 Wavelet spectra of the byte throughput time series for Leipzig-II inbound and its four types of source-level trace replay.
6.10 Wavelet spectra of the byte throughput time series for Leipzig-II outbound and its four types of source-level trace replay.
6.11 Wavelet spectra of the packet throughput time series for Leipzig-II inbound and its four types of source-level trace replay.
6.12 Wavelet spectra of the packet throughput time series for Leipzig-II outbound and its four types of source-level trace replay.
6.13 Active connection time series for Leipzig-II and its four types of source-level trace replay.
6.14 Byte throughput time series for UNC 1 PM inbound and its four types of source-level trace replay.
6.15 Byte throughput time series for UNC 1 PM outbound and its four types of source-level trace replay.
6.16 Packet throughput time series for UNC 1 PM inbound and its four types of source-level trace replay.
6.17 Packet throughput time series for UNC 1 PM outbound and its four types of source-level trace replay.
6.18 Byte throughput marginals for UNC 1 PM inbound and its four types of source-level trace replay.
6.19 Byte throughput marginals for UNC 1 PM outbound and its four types of source-level trace replay.
6.20 Packet throughput marginals for UNC 1 PM inbound and its four types of source-level trace replay.
6.21 Packet throughput marginals for UNC 1 PM outbound and its four types of source-level trace replay.
6.22 Wavelet spectra of the byte throughput time series for UNC 1 PM inbound and its four types of source-level trace replay.
6.23 Wavelet spectra of the byte throughput time series for UNC 1 PM outbound and its four types of source-level trace replay.
6.24 Wavelet spectra of the packet throughput time series for UNC 1 PM inbound and its four types of source-level trace replay.
6.25 Wavelet spectra of the packet throughput time series for UNC 1 PM outbound and its four types of source-level trace replay.
6.26 Active connection time series for UNC 1 PM and its four types of source-level trace replay.
6.27 Byte throughput time series for UNC 1 AM inbound and its four types of source-level trace replay.
6.28 Byte throughput time series for UNC 1 AM outbound and its four types of source-level trace replay.
6.29 Packet throughput time series for UNC 1 AM inbound and its four types of source-level trace replay.
6.30 Packet throughput time series for UNC 1 AM outbound and its four types of source-level trace replay.
6.31 Byte throughput marginals for UNC 1 AM inbound and its four types of source-level trace replay.
6.32 Byte throughput marginals for UNC 1 AM outbound and its four types of source-level trace replay.
6.33 Packet throughput marginals for UNC 1 AM inbound and its four types of source-level trace replay.
6.34 Packet throughput marginals for UNC 1 AM outbound and its four types of source-level trace replay.
6.35 Wavelet spectra of the byte throughput time series for UNC 1 AM inbound and its four types of source-level trace replay.
6.36 Wavelet spectra of the byte throughput time series for UNC 1 AM outbound and its four types of source-level trace replay.
6.37 Wavelet spectra of the packet throughput time series for UNC 1 AM inbound and its four types of source-level trace replay.
6.38 Wavelet spectra of the packet throughput time series for UNC 1 AM outbound and its four types of source-level trace replay.
6.39 Active connection time series for UNC 1 AM and its four types of source-level trace replay.
6.40 Byte throughput time series for UNC 7:30 PM inbound and its four types of source-level trace replay.
6.41 Byte throughput time series for UNC 7:30 PM outbound and its four types of source-level trace replay.
6.42 Packet throughput time series for UNC 7:30 PM inbound and its four types of source-level trace replay.
6.43 Packet throughput time series for UNC 7:30 PM outbound and its four types of source-level trace replay.
6.44 Byte throughput marginals for UNC 7:30 PM inbound and its four types of source-level trace replay.
6.45 Byte throughput marginals for UNC 7:30 PM outbound and its four types of source-level trace replay.
6.46 Packet throughput marginals for UNC 7:30 PM inbound and its four types of source-level trace replay.
6.47 Packet throughput marginals for UNC 7:30 PM outbound and its four types of source-level trace replay.
6.48 Wavelet spectra of the byte throughput time series for UNC 7:30 PM inbound and its four types of source-level trace replay.
6.49 Wavelet spectra of the byte throughput time series for UNC 7:30 PM outbound and its four types of source-level trace replay.
6.50 Wavelet spectra of the packet throughput time series for UNC 7:30 PM inbound and its four types of source-level trace replay.
6.51 Wavelet spectra of the packet throughput time series for UNC 7:30 PM outbound and its four types of source-level trace replay.
6.52 Active connection time series for UNC 7:30 PM and its four types of source-level trace replay.
6.53 Byte throughput time series for Abilene-I Clev/Ipls and its four types of source-level trace replay.
6.54 Byte throughput time series for Abilene-I Ipls/Clev and its four types of source-level trace replay.
6.55 Packet throughput time series for Abilene-I Clev/Ipls and its four types of source-level trace replay.
6.56 Packet throughput time series for Abilene-I Ipls/Clev and its four types of source-level trace replay.
6.57 Byte throughput marginals for Abilene-I Clev/Ipls and its four types of source-level trace replay.
6.58 Byte throughput marginals for Abilene-I Ipls/Clev and its four types of source-level trace replay.
6.59 Packet throughput marginals for Abilene-I Clev/Ipls and its four types of source-level trace replay.
6.60 Packet throughput marginals for Abilene-I Ipls/Clev and its four types of source-level trace replay.
6.61 Wavelet spectra of the byte throughput time series for Abilene-I Clev/Ipls and its four types of source-level trace replay.
6.62 Wavelet spectra of the byte throughput time series for Abilene-I Ipls/Clev and its four types of source-level trace replay.
6.63 Wavelet spectra of the packet throughput time series for Abilene-I Clev/Ipls and its four types of source-level trace replay.
6.64 Wavelet spectra of the packet throughput time series for Abilene-I Ipls/Clev and its four types of source-level trace replay.
6.65 Active connection time series for Abilene-I and its four types of source-level trace replay.
7.1 Bodies of the distributions of connection inter-arrivals for UNC 1 PM and 1 AM, and their exponential fits.
7.2 Tails of the distributions of connection inter-arrivals for UNC 1 PM and 1 AM, and their exponential fits.
7.3 Bodies of the distributions of connection inter-arrivals for Abilene-I and Leipzig-II, and their exponential fits.
7.4 Tails of the distributions of connection inter-arrivals for Abilene-I and Leipzig-II, and their exponential fits.
7.5 Average offered load vs. number of connections for 1,000 Poisson resamplings of UNC 1 PM.
7.6 Histogram of the average offered loads in 1,000 Poisson resamplings of UNC 1 PM.
7.7 Tails of the distributions of connection sizes for UNC 1 PM.
7.8 Analysis of the accuracy of connection-driven Poisson Resampling from 6,000 resamplings of UNC 1 PM (1,000 for each target offered load).
7.9 Comparison of average offered load vs. number of connections for 1,000 connection-driven Poisson resamplings and 1,000 byte-driven Poisson resamplings of UNC 1 PM.
7.10 Histogram of the average offered loads in 1,000 byte-driven Poisson resamplings of UNC 1 PM.
7.11 Analysis of the accuracy of byte-driven Poisson Resampling from 4,000 resamplings of UNC 1 PM (1,000 for each target offered load).
7.12 Analysis of the accuracy of byte-driven Poisson Resampling using source-level trace replay: replays of three separate resamplings of UNC 1 PM for each target offered load, illustrating the scaling down of load from the original 177.36 Mbps.
7.13 Analysis of the accuracy of byte-driven Poisson Resampling using testbed experiments: replay of one resampling of UNC 1 AM for each target offered load, illustrating the scaling up of load from the original 91.65 Mbps.
7.14 Connection arrival time series for UNC 1 PM (dashed line) and a Poisson arrival process with the same mean (solid line).
7.15 Connection arrival time series for UNC 1 AM and a Poisson arrival process with the same mean.
7.16 Wavelet spectra of the connection arrival time series for UNC 1 PM and a Poisson arrival process with the same mean.
7.17 Wavelet spectra of the connection arrival time series for UNC 1 AM and a Poisson arrival process with the same mean.
7.18 Block resamplings of UNC 1 PM: impact of different block lengths on the wavelet spectrum of the connection arrival time series.
7.19 Block resamplings of UNC 1 AM: impact of different block lengths on the wavelet spectrum of the connection arrival time series.
7.20 Block resamplings of UNC 1 PM: average offered load vs.
number of connection vectors (left) and corresponding histograms of average offered loads (right) in 3,000 resamplings.
7.21 Wavelet spectra of several random subsamplings of the connection vectors in UNC 1 PM (left) and 1 AM (right).
7.22 Analysis of the accuracy of byte-driven Block Resampling using source-level trace replay: replays of two separate resamplings of UNC 1 PM for each target offered load, illustrating the scaling down of load from the original 177.36 Mbps.
7.23 Analysis of the accuracy of byte-driven Block Resampling using source-level trace replay: replay of one resampling of UNC 1 AM for each target offered load, illustrating the scaling up of load from the original 91.65 Mbps.
7.24 Wavelet spectra of the packet arrival time series for UNC 1 PM and the source-level trace replays of two block resamplings of this trace.
7.25 Wavelet spectra of the packet arrival time series for UNC 1 PM and the source-level trace replays of three Poisson resamplings of this trace.

LIST OF ABBREVIATIONS

ACK  Positive acknowledgment TCP segment
ADU  Application Data Unit
API  Application Programming Interface
AQM  Active Queue Management
BGP  Border Gateway Protocol
BPF  Berkeley Packet Filter
C.I.  Confidence Interval
CCDF  Complementary Cumulative Distribution Function
CDF  Cumulative Distribution Function
DAG  Data Acquisition and Generation
FIFO  First-In First-Out
FIN  TCP control flag indicating "no more data from sender"
FTP  File Transfer Protocol
GB  Gigabyte
GPS  Global Positioning System
HTML  HyperText Markup Language
HTTP  HyperText Transfer Protocol
I/O  Input/Output
ICMP  Internet Control Message Protocol
IP  Internet Protocol
IRC  Internet Relay Chat
ISP  Internet Service Provider
K-S  Kolmogorov-Smirnov test
KB  Kilobyte
Kpps  Kilopackets per second
LRD  Long-Range Dependence
MB  Megabyte
MIME  Multipurpose Internet Mail Extensions
MSS  Maximum Segment Size
MTU  Maximum Transmission Unit
Mbps  Megabits per second
NNTP  Network News Transfer Protocol
OSTT  One-Side Transit Time
PMA  Passive Measurement and Analysis
Q-Q  Quantile-Quantile
RED  Random Early Detection
RFC  Request For Comments
RST  TCP control flag indicating "connection reset"
RTT  Round-Trip Time
SMTP  Simple Mail Transfer Protocol
SSH  Secure Shell
SYN  Synchronize TCP control segment
SYN-ACK  Positive acknowledgment of SYN segment
TCP  Transmission Control Protocol
UDP  User Datagram Protocol
UNC  University of North Carolina at Chapel Hill
URL  Uniform Resource Locator

CHAPTER 1
Introduction

As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.
- Albert Einstein (1879-1955)

Humankind cannot stand very much reality.
- T. S. Eliot (1888-1965)

Research in networking has to deal with the extreme complexity of many layers of technology interacting with each other in frequently unexpected ways. As a consequence, there is a broad consensus among researchers that purely theoretical analysis is not enough to demonstrate the effectiveness of network technologies. More often than not, careful experimentation in simulators and network testbeds under controlled conditions is needed to validate new ideas. Every researcher therefore faces, at some point or another, the need to design realistic networking experiments, and synthetic network traffic is a foremost element of these experiments.
Synthetic network traffic represents not only the workload of a computer network, but also the direct or indirect target of any optimization. For instance, congestion control research focuses on preserving as much as possible the ability of a network to transfer data in the face of overload. Therefore, evaluating a new congestion control mechanism in a transport protocol such as the Transmission Control Protocol (TCP) [Pos81] usually requires constructing experiments in which a number of network hosts exchange data using this protocol in an environment with one or more saturated links. The value of the new mechanism is then expressed as a function of the performance of these data exchanges. For example, the new mechanism may be optimized for achieving a higher overall throughput or a fairer allocation of bandwidth.

A fundamental insight, which provides the main motivation for this dissertation, is that the characteristics of synthetic traffic have a dramatic impact on the outcome of networking experiments. For example, a new mechanism that improves the throughput of bulk, long-lasting file transfers in a congested environment may not improve, and may even degrade, the response time of the small data exchanges in web traffic. This was precisely the case of Random Early Detection (RED), an Active Queue Management (AQM) mechanism. The original analysis by Floyd and Jacobson [FJ93a] clearly demonstrated the benefits of RED over the basic First-In First-Out (FIFO) queuing mechanism for bulk transfers. In this study, RED queues were exposed to a small number (2-4) of large file transfers. However, a later experimental study by Christiansen et al. [CJOS00] showed that this first AQM mechanism degraded the performance of web traffic in highly congested environments. Unlike the bulk transfers of the original evaluation, web traffic consists mostly of a very large number of small data transfers, which create a very different workload.
The emergence of the web clearly changed the nature of Internet traffic, and made it necessary to revisit existing results obtained under different workloads. The systematic evaluation of network mechanisms must therefore include experiments covering the wide range of traffic characteristics observed on Internet links. It is critical to provide the research community with methods and tools for generating synthetic traffic as representative as possible of this range of characteristics.

The concept of source-level modeling introduced by Paxson and Floyd [PF95] constitutes a major influence on this dissertation. These authors advocated building models of the behavior of Internet applications (i.e., the sources of Internet traffic), and generating traffic in networking experiments by driving network stacks with these application models. The main benefit of this approach is that traffic is generated in a closed-loop manner, which fully preserves the fundamental feedback loop between network endpoints and network characteristics. For example, a model of web traffic can be used to generate traffic using TCP/IP network stacks, and the generated traffic will properly react to different levels of congestion in networking experiments. In contrast, open-loop traffic generation is associated with models of the packet arrivals on network links. These models are insensitive to changes in network conditions and tied to the original conditions under which they were developed, which makes them inappropriate for experimental studies that change these conditions.

The main motivation of our work is to address one important difficulty with source-level modeling. In the past, source-level modeling has been associated with characterizing the behavior of individual applications. While this approach can result in high-quality models, it is a difficult process that requires a large amount of effort. As a consequence, only a small number of models are available, and they are often outdated.
This is in sharp contrast to the traffic observed on most Internet links, which is driven by rich traffic mixes composed of a large number of applications. Source-level modeling of individual applications does not scale to modern traffic mixes, making it very problematic for networking researchers to conduct representative experiments with closed-loop traffic.

This dissertation presents a new methodology for generating network traffic in testbed experiments and software simulations. We make three main contributions. First, we develop a new source-level model of network traffic, the a-b-t model, for describing in a generic and intuitive manner the behavior of the applications driving TCP connections. Given a packet header trace collected at an arbitrary Internet link, we use this model to describe each TCP connection in the trace in terms of data exchanges and quiet times, without any knowledge of the actual semantics of the application. Our algorithms make it possible to efficiently derive empirical characterizations of network traffic, reducing modeling times from months to hours. The same analysis can be used to incorporate network-level parameters, such as round-trip times, into the description of each connection, providing a solid foundation for traffic generation.

Second, we propose a traffic generation method, source-level trace replay, where traffic is generated by replaying the observed behavior of the applications as sources of traffic. This is therefore a method for generating entire traffic mixes in a closed-loop manner. One crucial benefit of our method is that it can be evaluated by directly comparing an original trace and its source-level replay. This makes it possible to systematically study the realism of synthetic traffic, in terms of how well our description of the connections in the original traffic mix reflects the nature of the original traffic. In addition, this kind of comparison provides a means to understand the impact that the different characteristics of a traffic mix have on specific traces and on Internet traffic in general.

Third, we propose and study two approaches for introducing variability in the generation process and scaling (up or down) the level of traffic load in the experiments. These operations greatly increase the flexibility of our approach, enabling a wide range of experimental investigations conducted using our traffic generation method.

[Figure 1.1: Network traffic seen from different levels.]

1.1 Abstract Source-Level Modeling

This dissertation presents a methodology for generating synthetic network traffic that addresses some of the main shortcomings of existing techniques. Figure 1.1 illustrates the levels of detail at which Internet traffic can be studied, providing a good starting point for framing our discussion. We focus on the traffic on a single Internet link, such as the one between the University of North Carolina at Chapel Hill (UNC) and the Internet. We can study the traffic in this link at different levels of detail. The top-most time-line represents traffic observed in the link between UNC and the Internet as a sequence of packet arrivals. This level of detail is known as the aggregate packet arrival level. Here packets from many different connections were interleaved, creating a complex arrival process in the network link. In general, TCP traffic accounts for the vast majority of the packets on Internet links (usually between 90% and 95%), which justifies our focus on TCP in this work.

The second time-line depicts the packet arrivals that belonged to a single TCP connection. These packets were used to send data back and forth between two network endpoints, one located at UNC, and the other one somewhere on the Internet. The sources of these data are applications running on the endpoints, which rely on the packet switching service provided by the Internet to communicate.
Prominent examples of these applications are the World Wide Web, email, file sharing, etc. Hundreds of different applications are commonly found on Internet links. The traffic observed at an Internet link is therefore the result of multiplexing the communication of a large number of endpoints driven by a wide range of applications.

This dissertation considers the problem of generating traffic in networking experiments that preserves both the aggregate-level and the connection-level properties of traffic observed in a real network link. Note that we restrict ourselves to this most basic form of the problem, where only a single link is considered both for observing traffic and for reproducing it in networking experiments. Our findings can certainly be applied to a broader context, e.g., multiple links along a path following the "parking lot topology" [PF95], links in an ISP, etc., but we choose to keep to this problem in its most essential form throughout this dissertation.

As mentioned before, every connection on the Internet is driven by an application exchanging data between two endpoints. It is therefore possible to examine traffic at a higher level, where the communication is described in terms of application data units (ADUs) rather than network packets. This application level is illustrated in the bottom time-line of Figure 1.1, which reveals that the source of the packets in the second time-line was the exchange of data between a web browser and a web server using a TCP connection. The time-line shows a first ADU of 2,500 bytes, which carried a request for an HTML page. The way the data is organized within this ADU and its meaning are given by the specification of the HyperText Transfer Protocol (HTTP) [FGM+97], which standardizes the exchange of data between web browsers and web servers. The time-line shows a second ADU, sent by the web server to the web browser in response to the first ADU. It carried the actual HTML source code of the page requested by the browser.
Its size was 4,800 bytes, which included not only the HTML source code but also an appropriate HTTP header. The time-line shows another pair of ADUs that also corresponded to an HTTP request and an HTTP response, which this time carried an image file. Each ADU is associated with one or more packets in the second time-line. The amount of data in these ADUs and its meaning were decided by the application, while the actual number of packets, their sizes, the need for retransmissions, etc., were decided by lower layers (transport and below). The application level provides the starting point for the traffic modeling and generation methodology developed in this dissertation.

Our approach to traffic generation relies on the notion of source-level modeling, advocated by Paxson and Floyd [FP01]. Rather than directly generating packets according to some trace or some packet arrival model, source-level modeling involves simulating the behavior of the applications running on the endpoints and allowing lower layers to control the actual exchange of packets. For example, generating traffic with a source-level model of web traffic means simulating web browsers and web servers according to statistical models of web page sizes, the durations of user think times and other source-level parameters [Mah97, BC98, SHCJO01]. Modeling traffic at the source level produces descriptions of traffic that are mostly independent of the underlying protocols and network conditions, so they can be used to drive traffic generation in experiments that modify these same protocols and conditions. For this reason, source-level models are also known as network-independent models. For example, the size of an HTML page carried in a TCP connection does not change with the degree of congestion (it always has the same number of characters). Therefore, its size is a network-independent property. Lower-level descriptions of traffic, such as characterizations of packet arrivals, are network dependent.
For example, the rate at which the packets of a TCP connection arrive decreases as the degree of congestion increases, since TCP uses a congestion control algorithm that decreases the sending rate as the loss rate increases. Also, packet losses force TCP endpoints to perform retransmissions. This means that the transmission of the same amount of data at the source level (e.g., an HTML page) at different times may require different numbers of packets to be transferred, depending on the number of lost packets. A source-level model describes the sizes of ADUs, but not the times at which a connection should lower its sending rate or retransmit a packet. For this reason, the same model can be used to generate traffic under different network conditions, such as low and high levels of congestion. Endpoints generating traffic using these models are able to adapt to each specific set of network conditions in the experiments. This preserves the fundamental feedback loop that exists between endpoints and network conditions. For this reason, this type of traffic generation is said to be closed-loop. In contrast, traffic generated according to lower-level models is necessarily open-loop. For example, tcpreplay [tcpb] can be used to reenact the sending of every packet recorded in a trace, which results in open-loop traffic that is insensitive to the underlying network conditions. This traffic is inappropriate for experiments where network conditions are important, such as the evaluation of congestion control mechanisms.

In the past, source-level modeling has been considered a synonym of application modeling, so researchers have developed a number of application-specific models, including models for web traffic, file transfer and other individual applications. This approach is good if one is interested in the traffic generated by a single application (or by a handful of applications). However, if one is interested in realistic traffic mixes, application-specific traffic modeling has some important shortcomings.
The first problem is that application-specific modeling does not scale well to the large number of applications that form contemporary traffic mixes. For example, the weekly traffic report from Internet2 [Con04] collects separate statistics for more than 80 different applications that make up Internet2 traffic. Using existing technology, it is simply too time-consuming to develop and populate individual models for each application. Moreover, even if we had the resources to examine the behavior of all applications, many applications use proprietary protocols, so painstaking reverse engineering is needed to understand and model their behavior. In addition, Internet traffic evolves quickly, since new applications and improved versions of the existing ones appear very frequently.

This dissertation proposes a more general solution to the source-level modeling and the traffic generation problems. We develop an abstract model of network data exchange wherein each connection is described independently of the semantics of the application initiating the connection. This idea is illustrated in the third time-line of Figure 1.1. Here the communication is described in generic terms, simply as a sequence of ADU exchanges between the two endpoints of the TCP connection, without attaching any meaning to the ADUs. Other generic characteristics of traffic include the direction in which the ADUs are sent, from the connection initiator or from the connection acceptor, and the duration of quiet times between ADUs, which are due to user behavior and processing times. These characteristics can generally be used to describe the behavior of any specific application. For example, the ADUs of web traffic are HTTP requests and responses, while the inter-ADU times are user think times and server processing times.

[Figure 1.2: An a-b-t diagram illustrating a persistent HTTP connection.]
The crucial observation is that the sizes of ADUs and the times between them can be measured from the packet traces of connections without knowledge of the behavior of the application driving each connection. This makes it possible to construct a source-level description of the entire set of connections observed in a measured link, instead of only the connections driven by one or a few well-known applications. Any trace of packets traversing a network link can be transformed into an abstract source-level trace, without examining the payload of the packets and without instrumenting the endpoints.

Our approach to source-level modeling results in an abstract representation of a TCP connection using a notation that we call an a-b-t connection vector. We also refer to this idea as the a-b-t model, in the sense that it provides a mental model for understanding network traffic at the source level, rather than in the sense of a mathematical or statistical model [footnote 1]. The term a-b-t is descriptive of the basic building blocks of this model: a-type ADUs (a's), which are sent from the connection initiator to the connection acceptor, b-type ADUs (b's), which flow in the opposite direction, and quiet times (t's), during which no data segments are exchanged. We will make use of these terms to describe the source-level behavior of TCP connections throughout this dissertation.

Our a-b-t model has a sequential version and a concurrent version. The sequential version applies to connections where the endpoints follow a strict order in their exchange of ADUs. In this version, a TCP connection is described by a vector of epochs $(e_1, e_2, \ldots, e_n)$. Each epoch has the form $e_j = (a_j, ta_j, b_j, tb_j)$, where $a_j$ is the size of an ADU sent from the connection initiator to the connection acceptor, $b_j$ is the size of an ADU sent in the opposite direction, and $ta_j$ and $tb_j$ are inter-ADU quiet times (during which the endpoints are idle).
We call this representation of source-level behavior a sequential connection vector. For example, the connection illustrated in Figure 1.2 is represented as $((329, 0, 403, 0.12), (403, 0, 25821, 3.12), (356, 0, 1198, 15.3))$ using the sequential a-b-t model. This connection has three epochs, each carrying one HTTP request/response pair. The first epoch has an ADU $a_1$ of size 329 bytes, which was sent from the connection initiator (a web browser) to the connection acceptor (a web server), and an ADU $b_1$ of size 403 bytes, which was sent in the opposite direction. We also observe some quiet times between the ADUs, such as $tb_2$, which had a duration of 3.12 seconds. While Figure 1.2 includes labels for HTTP requests, responses and documents, our a-b-t notation is completely generic. We consider this TCP connection sequential because only one endpoint sent data to the other one at any point in the lifetime of the connection.

Footnote 1: Our a-b-t model does, however, provide a good foundation for developing mathematical and statistical models of traffic at the source level. This dissertation consistently follows a non-parametric approach to traffic modeling. The only exception is the Poisson Resampling method presented in Chapter 7, for which we also offer a more powerful non-parametric alternative, Block Resampling.

Figure 1.3: A diagram illustrating the interaction between two BitTorrent peers.

It is important to emphasize that an ADU is not a TCP segment (i.e., TCP packet), but an application message that is independent of its actual network representation as link-level packets. As such, an ADU can be of arbitrary size, like the smaller $a_1 = 329$ bytes and the larger $b_2 = 25{,}821$ bytes in the previous example. The transfer of $a_1$ would usually involve a single TCP segment, but it is also possible that this segment gets duplicated, or lost and then retransmitted. In this case, sending $a_1$ would result in two or more segments carrying this ADU.
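As a minimal sketch of the sequential connection vector described above, the epochs of the Figure 1.2 example can be held in a small data structure. The class and function names (`Epoch`, `total_bytes`, `total_quiet_time`) are hypothetical and introduced only for illustration, and the sketch assumes that each quiet time follows the ADU it is named after.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Epoch:
    """One epoch e_j = (a_j, ta_j, b_j, tb_j) of a sequential connection vector."""
    a: int     # size in bytes of the ADU sent by the connection initiator
    ta: float  # quiet time in seconds (assumed here to follow a_j)
    b: int     # size in bytes of the ADU sent by the connection acceptor
    tb: float  # quiet time in seconds (assumed here to follow b_j)


def total_bytes(vector: List[Epoch]) -> Tuple[int, int]:
    """Return (initiator bytes, acceptor bytes) summed over all epochs."""
    return sum(e.a for e in vector), sum(e.b for e in vector)


def total_quiet_time(vector: List[Epoch]) -> float:
    """Return the sum of all inter-ADU quiet times, in seconds."""
    return sum(e.ta + e.tb for e in vector)


# The persistent HTTP connection of Figure 1.2, written as three epochs.
FIGURE_1_2 = [
    Epoch(329, 0.0, 403, 0.12),
    Epoch(403, 0.0, 25821, 3.12),
    Epoch(356, 0.0, 1198, 15.3),
]

if __name__ == "__main__":
    print(total_bytes(FIGURE_1_2))
    print(round(total_quiet_time(FIGURE_1_2), 2))
```

Nothing in this structure refers to HTTP: the same four fields per epoch would describe any application that exchanges data sequentially.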
Our notation would still describe this part of the TCP connection as a single 329-byte ADU, and not as the sequence of TCP segments used to transfer the data. Similarly, transferring $b_2 = 25{,}821$ bytes requires a minimum of 18 TCP segments ($\lceil 25{,}821 / 1{,}460 \rceil = 18$) in a path without loss and with a regular Maximum Segment Size (MSS) of 1,460 bytes (the one derived from Ethernet's Maximum Transmission Unit (MTU) of 1,500 bytes, after subtracting 20 bytes for the IP header and 20 bytes for the TCP header). It may require many more segments in a lossy environment, or in a path with a lower MTU. However, these details are irrelevant at the abstract source level, where $b_2$ captures the need of one of the endpoints to send 25,821 bytes of data, and this need is independent of the way in which the data is transferred by the network. Our modeling is therefore network-independent, which makes it suitable for generating closed-loop traffic.

While most TCP connections are driven by applications that follow a sequential pattern of ADU exchanges, we can also find cases in which the two endpoints send data to each other at the same time. This is illustrated in Figure 1.3 using a BitTorrent [Coh03] connection, where we can see ADUs whose transmission overlaps in time (i.e., the ADUs are exchanged concurrently). This pattern is certainly less common than the sequential one, but it is supported in important protocols like HTTP/1.1 (pipelining), NNTP (streaming mode) and BitTorrent. Our analysis shows that while the fraction of connections with concurrent data exchanges is usually small (less than 4% in our traces), such concurrent connections often carry a significant fraction (15%-35%) of the total bytes seen in a trace, and hence modeling these connections is critical if one wants to generate realistic traffic mixes. To represent concurrent ADU exchanges, the actions of each endpoint are considered to occur independently of each other.
Thus each endpoint is a separate source generating ADUs that appear as a sequence of epochs following a unidirectional flow pattern. Formally, this means that we represent each connection as a pair $(\alpha, \beta)$ of connection vectors of the form

$$\alpha = ((a_1, ta_1), (a_2, ta_2), \ldots, (a_{n_a}, ta_{n_a}))$$

and

$$\beta = ((b_1, tb_1), (b_2, tb_2), \ldots, (b_{n_b}, tb_{n_b})),$$

where $a_i$ and $b_i$ are sizes of ADUs sent from the initiator and from the acceptor of the TCP connection respectively, and $ta_i$ and $tb_i$ are quiet times between the ADUs. We call this representation of source-level behavior a concurrent connection vector. Unlike the sequential version of the a-b-t model, this representation does not capture any causality between the two directions of a TCP connection. As a consequence, traffic generated according to this version of the model usually exhibits a substantial number of concurrent data exchanges.

The a-b-t model provides a simple yet expressive way of describing source-level behavior in a generic manner that is not tied to the details of any application. In addition, this non-parametric model was designed to incorporate quantities (ADU sizes, ADU directionality, and inter-ADU quiet time duration) that can be extracted from packet header traces in an efficient, accurate manner. We can easily imagine more complex and expressive models of TCP connections for which no efficient data acquisition algorithm exists, or models that deal with characteristics of source-level behavior that cannot be extracted purely from packet headers. In the case of the a-b-t model, we have developed a data acquisition algorithm that relies on TCP sequence numbers for measuring ADU sizes, and on the packet arrival timestamps obtained during trace collection to determine inter-ADU quiet times.
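To illustrate the concurrent representation, the following sketch stores the two unidirectional vectors side by side and computes when each endpoint would hand its ADUs to TCP. The names (`ConcurrentVector`, `send_offsets`) and the sample sizes are hypothetical. The sketch assumes that each quiet time follows its ADU and ignores transmission time, so the offsets are lower bounds rather than predicted times.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One direction of a concurrent connection: [(adu_size_bytes, quiet_time_s), ...]
Flow = List[Tuple[int, float]]


@dataclass(frozen=True)
class ConcurrentVector:
    """A pair of unidirectional flows with no causality between them."""
    alpha: Flow  # ADUs sent by the connection initiator
    beta: Flow   # ADUs sent by the connection acceptor


def send_offsets(flow: Flow) -> List[Tuple[float, int]]:
    """Return (offset_s, size) for each ADU, counting only quiet times.

    Transmission time is ignored, so these are the earliest possible
    offsets from the start of the connection.
    """
    offsets = []
    t = 0.0
    for size, quiet in flow:
        offsets.append((t, size))
        t += quiet  # assumed: the quiet time follows the ADU it is paired with
    return offsets


# Hypothetical example: both endpoints send independently of each other.
example = ConcurrentVector(
    alpha=[(100, 1.0), (200, 2.0), (50, 0.0)],
    beta=[(4000, 0.5), (4000, 0.0)],
)

if __name__ == "__main__":
    print(send_offsets(example.alpha))
    print(send_offsets(example.beta))
```

Because each flow is scheduled on its own clock, nothing prevents an a-type and a b-type ADU from being in flight at the same time, which is the behavior the concurrent version is meant to allow.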
Our algorithm constructs a data structure in which TCP segments are ordered according to their logical data order, i.e., the order in which data must be delivered to the application layer of the receiving endpoint. In reconstructing this logical order for each connection, we have developed methods for dealing with network pathologies such as arbitrary segment reordering, duplication and retransmission. Furthermore, when the data segments in a TCP connection cannot be ordered according to the logical data order, we can classify the connection as concurrent with certainty. Our data structure supports both sequential (i.e., bidirectional) and concurrent (i.e., unidirectional) ordering, making it possible to extract ADU sizes and quiet times with a single pass over the segments of a TCP connection found in a trace. The analysis can be performed in $O(sW)$ time, where $s$ is the number of data segments in the connection and $W$ is the maximum size of the TCP window (which bounds the maximum amount of reordering).

Figure 1.4: Overview of Source-level Trace Replay. (Diagram labels: Original Packet Header Trace $\mathcal{T}_h$, Trace Analysis, Original Connection Vectors $\mathcal{T}_c$, Trace Partitioning, Tmix Traffic Generators in the testbed, Generated Packet Header Trace $\mathcal{T}'_h$, Trace Analysis, Replayed Connection Vectors $\mathcal{T}'_c$.)

1.2 Source-Level Trace Replay

Our abstract source-level modeling of TCP connections provides a solid foundation for generating traffic mixes in simulators and network testbeds. We propose to generate traffic using source-level trace replay, as illustrated in Figure 1.4. Given a packet header trace $\mathcal{T}_h$ collected from some Internet link, we first use our data acquisition algorithm to analyze the trace and describe its content as a collection of connection vectors $\mathcal{T}_c = \{(T_i, C_i)\}$, where $T_i$ is the relative start time of the $i$-th TCP connection, and $C_i$ is the sequential or concurrent connection vector corresponding to this connection.
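To make the trace-analysis step more concrete, here is a deliberately simplified sketch of measuring ADU sizes from sequence numbers. It is not the algorithm developed in this dissertation: it assumes a sequential connection, relative sequence numbers starting at 0, and segments that arrive in timestamp order, and it treats every change of direction as an ADU boundary. The names `Segment`, `Adu` and `extract_adus` are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Segment(NamedTuple):
    ts: float        # arrival timestamp at the monitor, in seconds
    direction: str   # "a" = initiator to acceptor, "b" = acceptor to initiator
    seq: int         # relative sequence number of the first payload byte
    length: int      # payload bytes carried by the segment


class Adu(NamedTuple):
    direction: str
    size: int
    first_ts: float
    last_ts: float


def extract_adus(segments: Iterable[Segment]) -> List[Adu]:
    """Group data segments into ADUs, one ADU per run of same-direction data."""
    next_seq = {"a": 0, "b": 0}  # first byte not yet seen, per direction
    adus: List[Adu] = []
    for seg in sorted(segments, key=lambda s: s.ts):
        end = seg.seq + seg.length
        # Only bytes beyond the highest sequence number seen so far are new,
        # so retransmissions and duplicates contribute nothing.
        new_bytes = max(0, end - next_seq[seg.direction])
        if new_bytes == 0:
            continue
        next_seq[seg.direction] = end
        if adus and adus[-1].direction == seg.direction:
            last = adus[-1]
            adus[-1] = last._replace(size=last.size + new_bytes, last_ts=seg.ts)
        else:
            adus.append(Adu(seg.direction, new_bytes, seg.ts, seg.ts))
    return adus


# Hypothetical trace fragment; the last segment is a retransmission.
sample = [
    Segment(0.00, "a", 0, 329),
    Segment(0.05, "b", 0, 403),
    Segment(0.17, "a", 329, 403),
    Segment(0.20, "b", 403, 1460),
    Segment(0.21, "b", 1863, 1460),
    Segment(0.50, "b", 403, 1460),
]

if __name__ == "__main__":
    for adu in extract_adus(sample):
        print(adu)
```

Quiet times could then be estimated as the gap between the `last_ts` of one ADU and the `first_ts` of the next. Handling reordering and detecting concurrency, which the full algorithm addresses, are outside the scope of this sketch.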
The basic approach for generating traffic according to $\mathcal{T}_c$ is to replay every connection vector $C_i$. Each connection vector $C_i$ is replayed by starting a TCP connection precisely at $C_i$'s relative start time $T_i$, and transmitting the measured sequence of ADUs ($a_j$ and $b_j$) separated in time by the measured inter-ADU quiet times ($ta_j$ and $tb_j$). In this dissertation, we evaluate a specific implementation of this approach for FreeBSD network testbeds, where traffic is generated using a tool we developed called tmix. The goal of the direct source-level trace replay of $\mathcal{T}_c$ is to reproduce the source-level characteristics of the traffic in the original link, generating the traffic in a closed-loop fashion. Closed-loop traffic generation implies the need to simulate the behavior of applications, using regular network stacks to actually translate source-level behavior into network traffic. In particular, our experiments use an implementation which relies on the standard socket interface to reproduce the data exchanges in each connection vector. Generating traffic in this manner is closed-loop in the sense that it preserves the feedback mechanism in TCP, which adapts its behavior to changes in network conditions, such as loss and receiver saturation. In contrast, packet-level trace replay, the direct reproduction of $\mathcal{T}_h$, is an open-loop traffic generation method in the sense that TCP control algorithms are not used during the generation, and hence the traffic does not adapt to network conditions.

The evaluation of our methodology consists of comparing the original trace $\mathcal{T}_h$ and the synthetic trace $\mathcal{T}'_h$ obtained from the source-level trace replay. Validating our traffic generation method consists of transforming $\mathcal{T}'_h$ into a set of connection vectors $\mathcal{T}'_c$, using the same method used to transform $\mathcal{T}_h$ into $\mathcal{T}_c$. We then compare the resulting set of connection vectors $\mathcal{T}'_c$ with the original $\mathcal{T}_c$. In principle, they should be identical, since $\mathcal{T}_c$ represents the invariant source-level characteristics of $\mathcal{T}_h$.
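The replay of a single sequential connection vector can be sketched as a pair of per-endpoint scripts of socket-level operations. This is an illustration under stated assumptions, not a description of how tmix is implemented: the function name `replay_scripts` and the operation labels are hypothetical, and the sketch assumes that $ta_j$ is the delay the acceptor waits before sending $b_j$ and that $tb_j$ is the delay the initiator waits before the next epoch.

```python
from typing import List, Tuple, Union

EpochTuple = Tuple[int, float, int, float]  # (a_j, ta_j, b_j, tb_j)
Op = Tuple[str, Union[int, float]]          # ("send", bytes), ("recv", bytes), ("sleep", s)


def replay_scripts(vector: List[EpochTuple]) -> Tuple[List[Op], List[Op]]:
    """Translate a sequential connection vector into two endpoint scripts.

    Each script lists blocking operations to run over an ordinary TCP socket.
    Only sizes and quiet times are fixed here; how long each send or recv
    takes is left to TCP, which is what keeps the replay closed-loop.
    """
    initiator: List[Op] = []
    acceptor: List[Op] = []
    for a, ta, b, tb in vector:
        if a > 0:
            initiator.append(("send", a))
            acceptor.append(("recv", a))
        if ta > 0:
            acceptor.append(("sleep", ta))   # assumed: acceptor-side quiet time
        if b > 0:
            acceptor.append(("send", b))
            initiator.append(("recv", b))
        if tb > 0:
            initiator.append(("sleep", tb))  # assumed: initiator-side quiet time
    return initiator, acceptor


# The connection of Figure 1.2.
figure_1_2 = [(329, 0.0, 403, 0.12), (403, 0.0, 25821, 3.12), (356, 0.0, 1198, 15.3)]

if __name__ == "__main__":
    init_ops, acc_ops = replay_scripts(figure_1_2)
    print(init_ops)
    print(acc_ops)
```

Note that the scripts contain no packet sizes, retransmissions or sending rates. Those emerge from the network stack during the experiment, in contrast with a packet-level replay where every packet is dictated by the trace.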
There are, however, some differences that are explained by the nature of the model and our measurement methods.

The direct comparison of $\mathcal{T}_h$ and $\mathcal{T}'_h$ also provides a way to study the accuracy of our approach in terms of how well traffic is described by the a-b-t model. This is, however, a subtle exercise. The actual replay of $\mathcal{T}_c$, which creates $\mathcal{T}'_h$, necessarily requires the selection of a set of network-level parameters, such as round-trip times and TCP receiver window sizes, for each TCP connection in the source-level trace replay. The exact set of generated TCP segments and their arrival times is a direct function of these parameters. As a consequence, if we conduct a source-level trace replay using arbitrary network-level parameters, we obtain a $\mathcal{T}'_h$ with little resemblance to the original $\mathcal{T}_h$. The replayed a-b-t connection vectors may be a perfect description of the source behavior driving the original connections, but the generated packet-level trace $\mathcal{T}'_h$ would still be very different from the original $\mathcal{T}_h$. To address this difficulty, our replay incorporates network-level parameters individually derived from each connection in $\mathcal{T}_h$. We have also incorporated methods for measuring three important network-level parameters (round-trip time, TCP receiver window size and loss rate) into our analysis and generation procedure. While this set of parameters is by no means complete, it does include the main parameters that affect the average throughput of a TCP connection found in a trace. This enables us to generate traffic in a closed-loop manner that approximates measured traces very closely.

Incorporating network-level properties is important, but it is critical to understand the main shortcoming of this approach. The goal of our work is not to make the generated traffic $\mathcal{T}'_h$ identical to the original traffic $\mathcal{T}_h$, which could be accomplished with a simple packet-level replay.
As mentioned before, packet-level replays generate traffic that does not adapt to changes in network conditions, resulting in open-loop traffic. Our goal is to develop a closed-loop traffic generation method based on a detailed characterization of source behavior. Traffic generated in a closed-loop manner can adapt to different network conditions, which arise intrinsically when evaluating different network mechanisms. Our comparison of $\mathcal{T}_h$ and $\mathcal{T}'_h$ is only a means to understand the quality of our traffic generation method, where quality is considered to be higher the more closely the original trace is approximated. If enough parameters of the original traffic are accurately measured and incorporated into the traffic generation experiment, we expect to observe a great similarity between $\mathcal{T}_h$ and $\mathcal{T}'_h$. On the contrary, if we are missing some important parameters, we expect to observe substantial differences between traces.

By construction, traffic generated using source-level trace replay can never be identical to the original traffic. The statistical properties of original packet header traces are the result of multiplexing a large number of connections onto a single link, and these connections traverse a large number of different paths with a variety of network conditions. It is simply not possible to fully characterize this environment and reproduce it in a laboratory testbed or in a simulation. This is both because of the limitations of passive inference from packet headers, and because of the stochastic nature of network traffic. Source-level trace replay can never incorporate every factor that shaped $\mathcal{T}_h$, and therefore differences between $\mathcal{T}_h$ and $\mathcal{T}'_h$ are unavoidable. Still, finding a close match between an original trace and its replay, even if they are not identical, constitutes strong evidence of the accuracy of the a-b-t model and the data acquisition and generation methods we have developed. It also demonstrates the feasibility of generating realistic network traffic in a closed-loop manner that resembles a rich traffic mix.
1.3 Trace Resampling and Load Scaling

As long as the network setup of a simulation or testbed experiment remains unchanged, the source-level trace replay of a connection vector trace $\mathcal{T}_c = \{(T_i, C_i)\}$ always results in traffic that is similar to the original trace. Every replay contains the same number of TCP connections behaving according to the same connection vector specification and starting at the same times. Only tiny variations are introduced on the end-systems by changes in clock synchronization, operating system scheduling and interrupt handling, and at switches and routers by the stochastic nature of packet multiplexing. Source-level trace replay therefore has two desirable properties.

First, the quality of the synthetic traffic can be evaluated by directly comparing synthetic and original traffic. This makes it possible to study the accuracy of the analysis methods and the generation system with complete freedom, using any metric that can be derived from real traffic. In contrast, more abstract methods based on parametric models of traffic are inherently stochastic and therefore more difficult to evaluate. For such methods, it is less obvious whether the observed difference between the traffic generated using the parametric model and the original traffic from which the model derives should be considered acceptable.

Second, the generation of the synthetic traffic is fully reproducible. A researcher can expose a collection of network protocols and mechanisms to exactly the same closed-loop traffic, which provides the right foundation for fair comparative studies. In contrast, stochastic variation in the traffic generated using parametric models is often difficult to control. For example, experiments with models that rely on heavy-tailed distributions converge very slowly to comparable conditions, as discussed by Crovella and Lipsky [CL97].

While these properties are important, the practice of experimental networking often requires introducing controlled variability in the generated traffic in order to explore a wider range of scenarios.
This motivates the development of methods that manipulate $\mathcal{T}_c$ in order to generate different traffic that still resembles the original one. Furthermore, developing a statistically sound way of manipulating $\mathcal{T}_c$ is essential for generating traffic with different levels of offered load. This manipulation to match a target offered load is a very common need in experimental networking research, because the performance of a network mechanism or protocol is often affected by the amount of traffic to which it is exposed. Therefore, rigorous experimental studies frequently require generating a complete range of target loads.

In this dissertation, we propose two flexible methods for introducing variability in traffic generation experiments. In both cases, the set of connection vectors in $\mathcal{T}_c$ is randomly resampled, resulting in a new set $\mathcal{T}'_c$ that preserves the aggregate source-level characteristics of the original traffic. In our first method, Poisson Resampling, we construct a new connection vector trace $\mathcal{T}'_c$ by randomly resampling connections from $\mathcal{T}_c$, and assigning them exponentially distributed inter-arrival times. As a result, connections in $\mathcal{T}'_c$ arrive according to a Poisson process. In the second method, Block Resampling, we resample blocks (groups) of connections rather than individual connections. This method results in a more realistic connection arrival process, which matches the substantial burstiness observed in real traces. In more technical terms, Block Resampling preserves the moderate long-range dependence found in real connection arrival processes, while Poisson Resampling results in a short-range dependent connection arrival process. This difference is demonstrated in our experimental evaluation of the two methods. In addition, the evaluation shows that the duration of the resampling block creates a trade-off between shorter blocks (which increase the number of distinct resamplings) and long-range dependence (which disappears for short blocks).
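The two resampling ideas described above can be sketched in a few lines. This is a minimal illustration, not the procedure evaluated later in the dissertation: the function names, the uniform sampling with replacement, and the way blocks are cut at fixed multiples of the block duration are all assumptions made for the example.

```python
import random
from typing import Any, Dict, List, Tuple

Trace = List[Tuple[float, Any]]  # [(relative start time in seconds, connection vector), ...]


def poisson_resample(trace: Trace, n: int, mean_interarrival: float,
                     rng: random.Random) -> Trace:
    """Draw n connections with replacement and give them Poisson arrivals."""
    out: Trace = []
    t = 0.0
    for _ in range(n):
        t += rng.expovariate(1.0 / mean_interarrival)  # exponential inter-arrival
        _, vector = rng.choice(trace)
        out.append((t, vector))
    return out


def block_resample(trace: Trace, block_len: float, n_blocks: int,
                   rng: random.Random) -> Trace:
    """Draw whole blocks of connections, keeping start offsets within each block."""
    blocks: Dict[int, Trace] = {}
    for start, vector in trace:
        idx = int(start // block_len)
        blocks.setdefault(idx, []).append((start - idx * block_len, vector))
    n_original = int(max(start for start, _ in trace) // block_len) + 1
    out: Trace = []
    for k in range(n_blocks):
        idx = rng.randrange(n_original)  # a block may be empty if no connection started in it
        for offset, vector in blocks.get(idx, []):
            out.append((k * block_len + offset, vector))
    return out


# Hypothetical trace: connection vectors are replaced by labels for brevity.
original = [(0.0, "c0"), (5.0, "c1"), (61.0, "c2"), (62.0, "c3"), (130.0, "c4")]

if __name__ == "__main__":
    rng = random.Random(1)
    print(poisson_resample(original, 4, 30.0, rng))
    print(block_resample(original, 60.0, 3, rng))
```

In the block version, connections that started close together in the original trace stay close together in the resample, which is how the burstiness of the arrival process survives. The Poisson version discards that structure by construction.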
Our analysis demonstrates that block durations between 1 and 5 minutes offer the best compromise.

Researchers often need to conduct a set of experiments with a range of different traffic loads. When using a traditional source-level model, e.g., a model of web traffic, researchers have to first conduct a preliminary experimental study to determine how the parameters of the model, e.g., the number of user equivalents, affect the generated load [CJOS00, LAJS03, K LH+02]. This is usually known as the calibration of the traffic generator. Our resampling methods eliminate this common need for calibrating traffic generators, since the resampling process can be controlled to match a specific target load (i.e., the generated load is known a priori). In the case of Poisson Resampling, this is accomplished by changing the mean arrival rate of connections. In the case of Block Resampling, offered load is manipulated using block thinning (i.e., subsampling) and block thickening (i.e., combining blocks). Our work reveals that load scaling cannot be based simply on controlling the number of connections. Such an approach frequently results in offered loads that are far from the target, because the number of connections in a resample is not strongly correlated with the offered load represented by these connections. We address this difficulty by developing byte-driven versions of Poisson Resampling and Block Resampling, which scale load using a running count of the total data in the resampled trace $\mathcal{T}'_c$. Unlike the number of connections, the total amount of data in $\mathcal{T}'_c$ is strongly correlated with the traffic load offered by $\mathcal{T}'_c$. Our experiments confirm that byte-driven resampling is highly accurate.

1.4 Thesis Statement

This dissertation considers the following thesis:

1. An abstract source-level model can describe in detail the entire set of TCP application behaviors observed in real networks.

2. Descriptions of abstract source-level behavior can be empirically derived from packet header traces in an efficient, accurate manner.

3. Traffic generation based on this abstract source-level modeling results in synthetic traffic that is realistic and suitable for experimental networking research.

4. The abstract source-level model of a trace can be manipulated to introduce statistically valid variability in the generated traffic and also to accurately match a target offered load while preserving application characteristics.

1.5 Contributions

We highlight the following contributions from this dissertation:

We develop the concept of abstract source-level modeling and the a-b-t notation for describing the source-level behavior of entire traffic mixes.

We identify a fundamental dichotomy in source-level behavior between connections that exchange data sequentially and connections that exchange data concurrently. Our a-b-t notation includes a sequential version and a concurrent version that make it possible to appropriately describe these two types of behaviors. We formulate a formal test of concurrency that can be applied to the packet headers of any TCP connection, and that does not suffer from false positives. This enables us to accurately classify connections as sequential or concurrent. We show that only a small fraction of TCP connections (less than 4% in our traces) exchange data concurrently, but that these TCP connections account for a substantial fraction (up to 32%) of the total traffic.

We present an efficient algorithm for transforming a packet header trace into a collection of sequential and concurrent a-b-t connection vectors. Given a TCP connection for which we observe $s$ segments and that has a maximum receiver window size of $W$, the asymptotic cost of our algorithm is $O(sW)$. We demonstrate that this algorithm is accurate using traffic generated from synthetic applications (i.e., with known characteristics).
We develop source-level trace replay, a closed-loop traffic generation method that uses a-b-t connection vectors as a non-parametric model of network traffic. One key benefit of this approach is the possibility of directly comparing original and generated traffic, which we use to evaluate the "realism" of our traffic generation approach. This comparison requires us to incorporate some network-level parameters (round-trip times, maximum receiver window sizes, and possibly loss rates) into the traffic generation. These parameters can be measured from packet header traces. We pay special attention to passive round-trip time estimation in our data acquisition, developing the concept of One-Side Transit Time and studying the impact of delayed acknowledgments on passive round-trip time estimation.

We implement our traffic generation method in a network testbed, developing a new distributed traffic generation tool, tmix. We use this implementation to study the results of a large collection of trace replay experiments, evaluating the need for detailed source-level modeling and the impact of losses on measured network traffic. Our results demonstrate that detailed source-level modeling is often required for accurately approximating real traffic, which shows that source-level behavior is a major factor shaping Internet traffic. The most substantial differences are observed for the number of active connections and the number of packet arrivals per unit of time. Byte arrivals per unit of time and long-range dependence do not improve so consistently with the use of detailed source-level modeling. We also show that losses had only a secondary effect in our traces, but they are not negligible when comparing original and generated traffic.

We present two trace resampling algorithms which can be used to derive new traces from an existing one, preserving its statistical characteristics at the source level.
Our comparison of the two methods reveals that the observed long-range dependence in connection arrivals has no apparent impact on the long-range dependence of packet and byte arrivals.

We demonstrate the need for byte-driven rather than connection-driven resampling in order to accurately scale offered loads, and develop byte-driven versions of our two resampling methods. This approach eliminates the need for the experimental calibration of traffic generators (which studies the relationship between the parameters of the generator and the offered traffic load).

Our entire methodology makes it possible to conduct networking experiments with closed-loop synthetic traffic derived from real traces in an automated manner. This eliminates the need for painstaking parametric modeling.

1.6 Overview

Chapter 2 presents a review of the state of the art in synthetic traffic generation. We first expand our discussion of packet-level traffic generation and data acquisition, and then examine source-level traffic generation in more depth. We review the literature on application-specific modeling, discussing models of web traffic and other applications, and also consider several approaches for generating traffic driven by more than one application. We also discuss existing methods for controlling the traffic load created in networking experiments. The chapter finally considers some research efforts addressing implementation issues.

Chapter 3 discusses abstract source-level modeling, presenting several examples of real applications and how their behavior can be described using our a-b-t notation. We also present our measurement algorithm for transforming a packet header trace into a collection of sequential and concurrent a-b-t connection vectors. The chapter also includes a validation of the measurement method using synthetic applications, and a measurement study that examines the statistical properties of the a-b-t connection vectors extracted from five real traces.

Chapter 4 focuses on network-level measurement.
We first describe our methods for measuring round-trip times, window sizes and loss rates, and an evaluation of their accuracy. While this set of parameters is by no means complete, it does include the main parameters that affect the average throughput of a TCP connection found in a trace. The second part of Chapter 4 describes the network-level metrics that we consider in the evaluation of our traffic generation method: packet and byte throughput time series, their marginal distributions, wavelet spectra, Hurst parameter estimates and time series of active connections.

Chapter 5 describes source-level trace replay and our implementation in a network testbed. We present a validation of this implementation using the source-level trace replays of five traces. For each trace, we study the a-b-t connection vectors extracted from the original traces and those found in replays with and without packet losses at the network links. The results demonstrate the accuracy of our approach, and also uncover some difficulties, which are in some cases inherent to the a-b-t model and its passive method of data acquisition.

Chapter 6 examines the results of several source-level trace replay experiments. Our analysis compares original traces and their source-level trace replays using the rich set of metrics introduced in Chapter 4, revealing a remarkably close approximation. This study also includes a comparison of traffic generated with the a-b-t model and with a simplified version that "disables" source-level modeling, which is shown to perform well for some metrics and poorly for others. As in the previous chapter, we also consider experiments with and without artificial losses, showing that loss did not have a dominant impact on the characteristics of the original traffic. In general, our results provide a strong justification of our source-level modeling approach, demonstrating that the closed-loop replay of a-b-t connection vectors closely resembles real traffic.
Chapter 7 presents our two resampling methods, Poisson Resampling and Block Resampling. These methods enable the researcher to introduce controlled variability in source-level trace replay experiments, without sacrificing reproducibility. In addition, we consider the problem of load scaling, i.e., how to control the resampling process to obtain a new trace with a target offered load. Our work demonstrates that this task can be accomplished by keeping track of the total number of data bytes in the resampled trace, but not by keeping track of the number of connections. Our scaling methods eliminate the common need for running a preliminary study to calibrate the traffic generator.

Chapter 8 presents our conclusions and discusses future work.

CHAPTER 2
Related Work

A scientific theory should be as simple as possible, but no simpler.
- Albert Einstein (1879-1955)

The greatest challenge to any thinker is stating the problem in a way that will allow a solution.
- Bertrand Russell (1872-1970)

This chapter presents an overview of the research literature relevant to realistic traffic generation. We consider two types of works. First, we discuss the body of literature that developed the concepts and techniques currently in use for generating synthetic traffic in simulations and testbed experiments. Second, we examine the Internet measurement literature that informs the discussion of what is meant by "realistic" traffic generation. Intuitively, synthetic traffic resembling Internet traffic can only be realistic if derived from measurements conducted on real network links. We could argue that any Internet measurement paper helps to gain a better understanding of the nature of the Internet and its traffic, and is therefore relevant to realistic traffic generation. However, the sheer size of the Internet measurement literature makes a complete overview impractical, so we will restrict ourselves to the main works that had a direct impact on Internet traffic generation.
It is also interesting to note that the most recent trend in the field of traffic generation is precisely to combine traffic measurement and generation into a single, coherent approach [HCJS+01, LH02, SB04, HCSJ04].

Traffic generation for experimental networking research was identified as one of the key challenges in Internet modeling and simulation by Paxson and Floyd [PF95] in 1995. Interestingly, Floyd and Kohler [FK03] made a similar point in 2003, and argued that it was still difficult to conduct experiments with representative, validated synthetic traffic. While traffic measurement and Internet measurement in general have become increasingly popular in recent years, most studies are exploratory and provide little foundation on which to build traffic generators. This chapter provides an overview of the major works in the field of Internet traffic generation, considering first packet-level traffic generation and then source-level traffic generation. Other aspects of traffic generation, such as load scaling, incorporating network dependencies and implementation issues, are discussed at the end of the chapter.

2.1 Packet-Level Traffic Generation

In this dissertation we restrict the question of generating realistic traffic to a single link. This is the most essential form of the traffic generation problem. It does not seem possible to tackle the problem of generating traffic for multiple links, say the backbone of an ISP, if single-link traffic generation is not fully understood. The simplest way of generating realistic traffic on a single link is to inject packets into the network according to the characteristics of the packets observed traversing a real link. We will use the term packet-level traffic generation to refer to this approach. Packet-level traffic generation can mean either performing a packet-level replay, i.e., reproducing the exact arrivals and sizes of every observed packet, or injecting packets in such a manner as to preserve some set of statistical properties considered fundamental, or relevant for a specific experiment.
Packet-level replay, which has been implemented in tools like tcpreplay [tcpb], is a straightforward technique that is useful for certain types of experiments where the configuration of the network is not expected to affect the generated traffic. In other words, whenever it is reasonable to generate traffic that is invariant to (i.e., unresponsive to) the experimental conditions, packet-level replay is an effective means for generating synthetic traffic. For example, packet-level replays of traces collected from the Internet have been used to evaluate cache replacement policies in routing tables [Jai90, Fel88, GcC02]. In this type of experiment, different cache replacement policies are compared by feeding the lookup cache of a routing engine with a packet trace and computing the achieved hit ratio. Also, studies that require malicious traffic generation can often make use of packet-level replay [SYB04, RDFS04]. Malicious traffic (e.g., a SYN flood) is frequently not responsive to network conditions (and their degradation).

Before conducting an experiment in which traffic is generated using packet-level replay, researchers must obtain one or more traces of the arrivals of packets to a network link. These traces are collected using a packet "sniffer" to monitor the traffic traversing some given link. This packet capturing can be performed with or without hardware support. The most prominent example of software-only capture is the Berkeley Packet Filter (BPF) system [MJ93, tcpa]. BPF includes a packet capturing library, libpcap, and a command-line interface and trace analysis tool, tcpdump. BPF relies on the promiscuous mode of network interfaces to observe packets traversing a network link and to create a trace of them in the "pcap" format. Due to privacy and size considerations, most traces only include the protocol headers (IP and TCP/UDP) of each packet and a timestamp of the packet's arrival.
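To make the content of such a header trace concrete, the following minimal sketch reads the per-packet records of a classic pcap file: an arrival timestamp, the number of bytes actually captured, and the original packet length on the wire. It assumes the little-endian, microsecond-resolution variant of the format (magic number 0xA1B2C3D4). The function names and the two-packet example trace are hypothetical and are built in memory purely for illustration; a real study would read a file produced by tcpdump.

```python
import struct

# Classic pcap layout (little-endian, microsecond timestamps assumed).
# Global header: magic, version major, version minor, thiszone, sigfigs,
# snaplen, link type. Record header: ts_sec, ts_usec, incl_len, orig_len.
PCAP_GLOBAL_HEADER = struct.Struct("<IHHiIII")
PCAP_RECORD_HEADER = struct.Struct("<IIII")
PCAP_MAGIC = 0xA1B2C3D4


def build_example_trace():
    """Build a tiny in-memory pcap byte string (hypothetical data)."""
    # snaplen = 68 bytes means only the first 68 bytes (headers) are kept.
    data = PCAP_GLOBAL_HEADER.pack(PCAP_MAGIC, 2, 4, 0, 0, 68, 1)
    # (ts_sec, ts_usec, captured bytes, original length on the wire)
    for ts_sec, ts_usec, caplen, wirelen in [(10, 250, 54, 54),
                                             (10, 900, 68, 1514)]:
        data += PCAP_RECORD_HEADER.pack(ts_sec, ts_usec, caplen, wirelen)
        data += bytes(caplen)  # zero-filled placeholder for header bytes
    return data


def read_packet_records(data):
    """Return (snaplen, [(timestamp, captured_len, wire_len), ...])."""
    fields = PCAP_GLOBAL_HEADER.unpack_from(data, 0)
    if fields[0] != PCAP_MAGIC:
        raise ValueError("not a little-endian microsecond pcap trace")
    snaplen = fields[5]
    offset = PCAP_GLOBAL_HEADER.size
    records = []
    while offset + PCAP_RECORD_HEADER.size <= len(data):
        ts_sec, ts_usec, incl_len, orig_len = \
            PCAP_RECORD_HEADER.unpack_from(data, offset)
        records.append((ts_sec + ts_usec / 1e6, incl_len, orig_len))
        # Skip the record header and the captured bytes that follow it.
        offset += PCAP_RECORD_HEADER.size + incl_len
    return snaplen, records


if __name__ == "__main__":
    snaplen, records = read_packet_records(build_example_trace())
    for timestamp, captured, on_wire in records:
        print(f"t={timestamp:.6f}s captured={captured}B wire={on_wire}B")
```

The second example record illustrates why header-only traces stay small: the packet occupied 1514 bytes on the wire, but only the first 68 bytes were stored.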
Monitoring high-speed links with a software-only system is problematic, given that traffic has to be forwarded from the network interface to the monitoring software using the system bus. The system bus may not be fast enough for this task depending on the load on the monitored link. High loads can result in "dropped" packets that are absent from the collected trace. Furthermore, the extra forwarding from the wire to the monitoring program, which usually involves buffering in the network interface and in operating system layers, makes timestamps rather inaccurate. In the case of BPF, timestamping inaccuracies of a few hundreds of microseconds are quite common. In order to overcome these difficulties, researchers often make use of specialized hardware that can extract headers and provide timestamps without the intervention of the operating system. This is of course far more expensive, but it dramatically improves timestamp accuracy and increases the volume of traffic that can be collected without drops. The DAG platform [Pro, GMP97, MDG01] is a good example of this approach, and it is widely used in network measurement projects. The timestamping accuracy of DAG traces is on the order of nanoseconds. Multiple DAG cards, possibly at different locations, can also be synchronized using an external clock signal, such as the one from the Global Positioning System (GPS). Besides collecting their own traces, researchers can also make use of public repositories of pcap and DAG traces, such as the Internet Traffic Archive [Int] and the PMA project at NLANR [nlab].

While packet-level replay is conceptually simple, it involves a number of engineering challenges. First, traffic generators usually rely on operating system layers and abstractions, such as raw sockets, to perform the packet-level replay. Most operating systems provide no guarantee on the exact delay between the time of packet injection by the traffic generator and the time at which the packet leaves the network interface.
Servicing interrupts, scheduling processes, etc., can introduce arbitrary delays, which make the arrival process of the packet replay differ from the original and intended arrival process. This inaccuracy may or may not be significant for a given experiment.

Another challenge is the replay of traces collected in high-speed links. The rate of packet arrivals in a trace can be far higher than the rate at which a single host can generate packets. For example, the speed at which a commodity PC can inject packets into the network is primarily limited by the speed of its bus and the bandwidth of its network interface. As a consequence, replaying a high-rate trace often requires an experimenter to partition the trace into subtraces that have to be replayed using a collection of hosts. In this case, it is important to carefully synchronize the replay of these hosts. This is generally a difficult task, since the synchronization has to be done using the network itself, which introduces variable I/O delays. Clock drift is also a concern with common PC clocks.

Ye et al. [YVIB05] discussed packet-level replay of high-rate traces, focusing on OC-48, and how to evaluate the accuracy of the replay. They proposed flow-based splitting to construct a partition of the original trace that can be accurately replayed by an ensemble of traffic generators. This addresses the challenge of replaying a trace using multiple traffic generators without reordering the packets within a flow. In contrast, round-robin assignment of packets to traffic generators, called choice of N in this work, results in packets belonging to the same flow being generated by different traffic generators. As a consequence, the generated traffic exhibits substantial packet reordering. This reordering is due to the difficulty of keeping the generators perfectly synchronized with commodity hardware, so one generator can easily get ahead of another and modify the order of packets within a flow. Ye et al.
also discussed the difficulties created by buffering on the network cards, which modifies the properties of the packet arrival process at fine scales. An alternative to the approach in Ye et al. is to rely on specialized hardware. Most DAG cards support packet-level replay, bypassing the network stack. However, no information is available on how accurately the generated traffic preserves the properties of the original packet arrival process.

Packet-level replay has two important shortcomings: it is inflexible and it is open-loop. Given that a packet-level replay is the exact reproduction of a collected trace, both in terms of packet arrival times and packet content, there is no way to introduce variability in the experiments other than acquiring a collection of traces and using a different trace in different runs of the experiments. This makes packet replay inflexible, since the researcher has to limit his experiments to the available traces and their characteristics. The "right" traces may not be available or may be difficult to collect. Even conducting experiments that study simple questions can be cumbersome. For example, a researcher who intends to test a cache replacement policy under heavy loads must find traces with high packet arrival rates, which may or may not be available. Similarly, evaluating a queuing mechanism under a range of (open-loop) loads requires one to find traces covering this range of loads, and may involve mixing traces from different locations, which could cast doubt on the realism of the resulting traffic and thus on the conclusions of the evaluation.

More flexible traffic generation can be achieved by generating packets according to a set of statistical properties derived from real measurements. The challenge then is to determine which properties of traffic are most important to reproduce so that the synthetic generated traffic makes the experiments "realistic enough."
For example, Internet traffic has been found to be very bursty, showing very frequent changes in throughput (both for packets and bytes per unit of time). Therefore, most experiments should make use of synthetic traffic that preserves this observed burstiness. Leland et al. [LTWW93] observed that this burstiness can be studied using the framework provided by statistical self-similarity. At a high level, self-similarity means that traffic is equally bursty, i.e., equal variance in arrival times, across a wide range of time scales. This is similar to the geometric self-similarity that fractals exhibit. Mathematically, statistical self-similarity manifests itself as long-range dependence, a sub-exponential decay of the autocorrelation of a time series with scale. This is in sharp contrast to Poisson modeling and its short-range dependence, which implies an exponential decay of the autocorrelation with scale. Therefore, it is generally difficult to accept experimental results where synthetic traffic does not exhibit some degree of self-similarity. Accordingly, some experiments may simply rely on some method for generating a self-similar process [Pax97] and inject packets into the experiments according to this process. Studies on queuing dynamics, e.g., [ENW96], made use of this traffic generation approach.

Other experiments with a more stringent need for realism may also attempt to reproduce other known properties of traffic. For example, a realistic distribution of IP addresses is essential for experiments in which route caching performance is evaluated. To accomplish this, packet-level traffic generation can be combined with a statistical model of packet arrival and a model of address structure. As one example, Aida and Abe [AA01] proposed a generative model based on the finding that the popularity of addresses follows a powerlaw (a heavy-tailed distribution with a hyperbolic shape). In contrast, Kohler et al.
[KLPS02] focused on the hierarchical structure of addresses and prefixes, which is shown to be well-described by a multi-fractal model. Both studies could be used to enrich packet-level traffic generation.

2.2 Source-Level Traffic Generation

While packet-level traffic generation based on a set of statistical properties is convenient for the experimenter, and attractive from a mathematical point of view, it fails to preserve an essential property of Internet traffic. As Paxson and Floyd [PF95] point out, packet-level traffic generation is open-loop, in the sense that it does not preserve the feedback loop that exists between the sources of the traffic (the endpoints) and the network. This feedback loop comes from the fact that endpoints react to network conditions, and this reaction itself can change these conditions, and therefore trigger further changes in the behavior of the endpoints. For example, TCP traffic reacts to congestion by lowering its sending rate, which in turn decreases congestion. A trace of packet arrivals collected at some given link is therefore specific to the characteristics of this link, the time of the tracing, the paths of the connections that traversed it, etc. Therefore, any changes that the experimenter makes to the experimental conditions make the packet-level traffic invalid, since the traffic generation process is insensitive to these changes (unlike real Internet traffic). For example, packet-level replay of TCP traffic does not react to congestion in any manner.

The solution is to model the sources of traffic, i.e., to model the network behavior of the applications running on the endpoints that communicate using network flows. Source-level models are then used to drive network stacks that do implement flow and congestion control mechanisms, and therefore react to changes in network conditions as real Internet endpoints do. As a result, the generated traffic is closed-loop, which is far more realistic for a wide range of experiments.

The simplest source-level model is the infinite source model.
The starting point of the infinite source model is the availability of an infinite amount of data to be communicated from one endpoint to another. Generating traffic according to this model means that a traffic generator opens one or more transport connections, and constantly provides them with data to be transferred. This means that, for each connection, one of the endpoints is constantly writing (sending data packets) while the other endpoint is constantly reading (receiving data packets). The sources are never the bottleneck in this model. The only process that limits the rate at which the endpoints transmit data is the network, broadly defined to include any mechanism below the sources, such as TCP's maximum receiver window.

The infinite source model is very attractive for several reasons, which make it rather popular in both theoretical and experimental studies [FJ93b, KHR02, AKM04, SBDR05]. First, the infinite source model has no parameters and hence it is easy to understand and amenable to formal analysis. It was, for example, the foundation for the work on the mathematical analysis of steady-state TCP throughput [PFTK98, BHCKS04]. Second, its underlying assumption is that the largest flows on the network, which account for the majority of the packets and the bytes, "look like" infinite sources. For example, an infinite source provides a convenient approximation to a multi-gigabyte file download using FTP. Third, infinite sources are well-behaved, in the sense that, if driving TCP connections, they try to consume as much bandwidth as possible. They also result in the ideal case for bandwidth sharing. This makes them useful for experiments in the area of congestion control, since infinite sources can easily congest network links.

Despite their convenience, infinite sources are unrealistic and do not provide a solid foundation for networking experiments, or even for understanding the behavior and performance of the Internet. The pioneering work by Cáceres et al.
[CDJM91], published as early as 1991, provided a first insight into the substantial difference between infinite sources and real application traffic. These authors examined packet header traces from three sites (the University of California at Berkeley, the University of Southern California, and Bellcore in New Jersey) using the concept of application-level conversations. An application-level conversation was defined as the set of packets exchanged between two network endpoints. These conversations could include one or more "associations" (TCP connections and UDP streams). A general problem when studying traffic for extended periods is the need to separate traffic into independent units of activity, which in this case correspond to conversations. Endpoints may exchange traffic regularly, say every day, but that does not mean that they are engaged in the same conversation for days. Danzig et al. separated conversations between the same endpoints by identifying long periods without any traffic exchange, which are generally referred to as idle times or quiet times in the literature. In their study, they used a threshold of 20 minutes to differentiate between two conversations. The authors examined conversations from 13 different applications, characterizing them with the help of empirical cumulative distribution functions (empirical CDFs). The results include empirical CDFs for the number of bytes in each conversation, the directionality of the flow of data (i.e., whether the two endpoints sent a similar amount of data), the distribution of packet sizes, the popularity of different networks, etc. Danzig and Jamin [DJ91] used these distributions in their traffic generation tool, tcplib. The results from this work are further discussed in Section 2.2.2.

Cáceres et al. pointed out a number of substantial differences between their results and the assumptions of earlier works. First, the majority of connections carried very small amounts of data, less than 10 KB in 75-90% of the cases.
This is true for both interactive applications (e.g., telnet and rlogin) and bulk transfer applications (e.g., FTP, SMTP). This is in sharp contrast to the infinite availability of data to be transferred assumed in the infinite source model. The dynamics of such short data transfers are completely different from those of infinite sources, which for example have time to fully employ congestion control mechanisms. The second difference was that traffic from most applications was shown to be strongly bidirectional, and it included at least one request/response phase, i.e., an alternation in the role of the endpoints as senders of data. The infinite source model is inherently unidirectional, with one of the endpoints always acting as the sender, and the other endpoint always acting as the receiver. Third, the authors observed a wide range of packet sizes, and a large fraction of the data packets were small, even for bulk applications. Data packets from an infinite source are necessarily full size, since there is by definition enough data to completely fill new packets.

These measurement results highlighted a substantial difference between infinite sources and real traffic, and later experimental studies demonstrated the perils of using traffic from infinite sources in the evaluation of network mechanisms. Joo et al. [JRF+99, JRF+01] demonstrated that infinite TCP sources tend to become synchronized, so they increase or decrease their transmission rate at the same time. This pattern is completely absent from more realistic experiments in which the majority of the sources have small and diverse amounts of data to send. As a result, loss patterns, queue lengths and other characteristics are strikingly different when more realistic synthetic traffic is used. Joo et al. also studied the difference between open-loop and closed-loop traffic generation.

The area of active queue management has provided several illustrations of the misleading results obtained with the unrealistic infinite sources.
The first AQM scheme, RED, was presented by Floyd and Jacobson in [FJ93b], and evaluated using infinite sources. Their results showed that RED significantly outperformed FIFO, the usual router queuing mechanism. Later work by Christiansen et al. [CJOS00] demonstrated that RED offers very little benefit, if any, when exposed to more realistic traffic where sources are not infinite. In particular, they used a model of web-like traffic, which is discussed later in this chapter.

Paxson's analysis [Pax94] of packet header traces from seven different network links provided further support for the conclusions of Cáceres et al. In addition, Paxson considered the parsimonious modeling of traffic from different applications. He characterized four prominent applications, telnet, NNTP, SMTP and FTP, using analytic models to fit the empirical distributions. Analytic models are more commonly known as parametric models in the statistical literature, and correspond to classical distributions, such as the Pareto distribution, that can be fully characterized with a mathematical expression and only one or a few parameters. As Paxson pointed out, the use of analytic models results in a concise description of network applications that can be easily communicated and compared, and are often mathematically tractable. His methodology has had a lasting influence in application-level modeling. He clearly demonstrated that analytic fits (i.e., parametric models) of the observed distributions can closely approximate the characteristics of real applications. However, it is important to remember that traffic is not necessarily more realistic when generated by analytic models as opposed to empirical models. Empirical CDFs, derived from network measurement of sufficient size, provide a perfectly valid foundation for traffic generators. Furthermore, finding analytic fits of complex random variables that do not match well-known statistical distributions is a daunting task.
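The practical difference between the two kinds of model can be illustrated with a short sketch. It draws values from an empirical CDF built directly from measured samples, and from a Pareto model specified only by a shape and a minimum value, in both cases by inverting the CDF at a uniform random number (the standard inverse transformation technique). The function names, the four measured sizes, and the Pareto parameters are hypothetical placeholders, not values taken from the studies cited above.

```python
import bisect
import random


def empirical_cdf(samples):
    """Return the sorted sample values and the CDF evaluated at each one."""
    values = sorted(samples)
    n = len(values)
    probs = [(i + 1) / n for i in range(n)]  # F(values[i]) = (i + 1) / n
    return values, probs


def sample_empirical(values, probs, rng):
    """Inverse transform: smallest observed value whose CDF is >= u."""
    u = rng.random()  # uniform in [0, 1)
    idx = bisect.bisect_left(probs, u)
    return values[min(idx, len(values) - 1)]


def sample_pareto(alpha, x_min, rng):
    """Invert the Pareto CDF F(x) = 1 - (x_min / x) ** alpha, x >= x_min."""
    u = rng.random()
    return x_min / (1.0 - u) ** (1.0 / alpha)


if __name__ == "__main__":
    rng = random.Random(2006)  # fixed seed keeps the run reproducible
    measured_sizes = [312, 1460, 5840, 40213]  # hypothetical object sizes
    values, probs = empirical_cdf(measured_sizes)
    print([sample_empirical(values, probs, rng) for _ in range(5)])
    print([round(sample_pareto(1.2, 1000.0, rng)) for _ in range(5)])
```

The empirical sampler can only return values that were actually observed, whereas the Pareto sampler can return arbitrarily large values. This is one reason why the size of the measurement dataset matters when an empirical model is used.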
2.2.1 Web Traffic Modeling

Modeling web traffic has received substantial attention since the sudden emergence of the World Wide Web in the mid-nineties. Arlitt and Williamson [AW95] proposed an early model for generating web traffic (see footnote 1), based on packet header traces collected at the University of Saskatchewan. The model was centered around the concept of a conversation, as proposed by Cáceres et al. [CDJM91]. In this case, a conversation was the set of connections observed between a web browser and a web server. These authors were the first to consider questions such as the distribution of the number of bytes in requests and responses, the arrival rates of connections, etc. In general, the proposed model has parameters that are quite different from those of later works. For example, an Erlang model of response sizes was used, which is in sharp contrast to the heavy-tailedness observed by other authors. While Arlitt and Williamson did not provide any details on the statistical methods they employed, it is likely that the small sample size (fewer than 10,000 TCP connections) made it difficult to develop a more statistically representative model.

Footnote 1: To be more specific, Arlitt and Williamson proposed a model of "Mosaic" traffic. Mosaic was the first web browser.

One of the major efforts in the area of web traffic modeling oriented toward traffic generation took place at Boston University. Cunha et al. [CBC95] examined client traces collected by instrumenting browsers at the Department of Computer Science. Unlike the packet header traces used in Arlitt and Williamson, client traces include application information such as the exact URL of each web object requested and downloaded in each TCP connection. The authors made use of this information to study page and server popularity, which are relevant for web caching studies. In addition, the authors proposed the use of powerlaws for constructing a parametric model of web traffic.
They relied on the Pareto distribution for modeling the sizes of downloaded objects, and the parameterless Zipf's law for modeling the popularity of specific pages.

Crovella and Bestavros [CB96] used these findings to explain the long-range dependence observed in the packet arrivals of web traffic. Their explanation was derived from earlier work by Willinger et al. [WTSW97], which showed that the multiplexing of heavy-tailed ON/OFF sources results in long-range dependent traffic. Crovella and Bestavros demonstrated that the underlying distributions of web object sizes, the effects of caching and user preference in file transferring, the effect of user "think time", and the superimposition of many web transfers precisely create the multiplexing process hypothesized by Willinger et al. Crovella and Bestavros also showed that the explanation behind the suitability of powerlaws for describing the sizes of web objects is that the sizes of files are well described by powerlaws. This refined previous studies of file-system characteristics (e.g., [BHK+91]), which observed long-tailed distributions of file sizes (but did not propose powerlaw models).

Powerlaw modeling has had a lasting impact on traffic modeling, which is natural given that the transfer of files is one of the most common uses of many application protocols. Countless studies have confirmed the usefulness of powerlaws for modeling application traffic. The eloquent term "mice and elephants" [GM01, MHCS02, EV03], often applied to Internet traffic, precisely refers to the basic characteristic of powerlaws: a majority of values are small (mice) but the uncommon large values (elephants) are so large that they account for a large fraction of the total value. For example, web traffic usually shows around 90% of web objects below 10 KB, but larger objects often account for 90% of the total bytes.

Researchers have used this general finding of powerlaw sizes to develop a generic, and mostly ad hoc, source-level model.
Traffic generated according to this model consists of a collection of TCP connections that transfer a single file, such that the distribution of file sizes follows a powerlaw. Researchers often refer to this kind of synthetic traffic as "mice-and-elephants-like" or "web-like" traffic [MGT00, KHR02]. This simple approach is rather convenient for traffic generation, but it ignores the more complex patterns of connection usage (e.g., bidirectionality, quiet times, etc.), and the differences among applications present in real Internet traffic.

It is important to note that recent work on the characterization of web traffic has improved our understanding of powerlaw/heavy-tailed modeling. Downey revisited the modeling of file sizes in [Dow01b] and of flow sizes in [Dow01a], suggesting that lognormal distributions are more appropriate than powerlaws (or heavy-tailed distributions). The historical survey by Mitzenmacher [Mit04] uncovered similar controversies in other fields, such as economics and biology. Hernández-Campos et al. demonstrated that lognormal distributions and powerlaws offer similar results in the regions of the distribution for which enough samples are available, specifically in the body and in the "moderate" tail. Beyond these regions, in the "far" tail, the lack of samples makes it impossible to choose between different models. This is because, for a fixed set of parameters and a fixed sample size equal to the original number of observations, some samplings of the lognormal and the powerlaw models match the original distribution, while other samplings do not. Hernández-Campos et al. also proposed the use of a mixture model (i.e., a combination of several classical models), the double Pareto lognormal, which enables far more accurate fits than those achieved with Pareto or lognormal models. The inherently more flexible double Pareto lognormal model can capture the systematic deviations from simpler models that are commonly observed in the tails of the distributions of web object sizes. Nuzman et al.
[NSSW02] modeled HTTP connection arrivals using the biPareto distribution, which provides a simpler but powerful alternative to mixture models. A Pareto distribution appears linear in a log-log scale, while the biPareto distribution shows two linear regions and a smooth transition between them. The biPareto distribution is therefore a generalization of the Pareto distribution.

The modeling efforts at Boston University culminated with the development of the SURGE model of web traffic [BC98]. The SURGE model described the behavior of each user as a sequence of web page downloads and think times between them. Each web page download consisted of one or more web objects downloaded from the same server. Barford and Crovella provided parametric fits for each of the components of the SURGE model, heavily relying on powerlaws and other long-tailed distributions. They also studied how SURGE traffic stressed web servers, and found SURGE's high burstiness far more demanding in terms of server CPU performance than that of less elaborate web traffic generators, such as the commercial WebStone.

A model of web traffic contemporary to SURGE was also presented by Mah [Mah97]. It described web traffic using empirical CDFs, which were derived from the analysis of packet header traces. As in the case of the SURGE model, the data came from the population of users in a computer science department. The two models were compared by Hernández-Campos et al. [HCJS03], showing substantial consistency.

The introduction of persistent connections in HTTP motivated further work on web traffic modeling. Barford et al. studied the performance implications of persistent connections [BC99], and modified the SURGE model to incorporate persistency [BBBC99]. The analysis of persistent connections was also a major topic in Smith et al. [SHCJO01] and Hernández-Campos et al. [HCJS03]. These studies were far larger in scope, focusing on the web traffic of an entire university rather than of a single department.
These latter two works provided the starting point for the analysis method presented in this dissertation.

Many experimental studies made use of synthetic traffic generated according to one of the aforementioned web traffic models. For example, Christiansen et al. [CJOS00] made use of the Mah model, while Le et al. [LAJS03] used the Smith et al. model. The popular NS-2 [BEF+00] network simulator also supports web traffic generation using models that are structurally similar to the SURGE model. This feature of NS was used in Joo et al. [JRF+99, JRF+01] to compare web traffic and infinite sources, and by Feldmann et al. [FGHW99] to study the impact of different parameters of the web traffic model on the burstiness of the generated traffic. Another web traffic generator available in NS-2 was developed by Cao et al. [CCG+04]. Unlike other web traffic models, it was connection-oriented rather than user-oriented, and included non-source-level characteristics, such as packet sizes.

An important effort in web traffic analysis and generation was the "Monkey See, Monkey Do" method, developed by Cheng et al. [CHC+04a]. The method involved recording source-level and network-level characteristics for each observed connection, and reproducing these characteristics using a synthetic workload generator. This idea is similar to the one developed in this dissertation, although we tackle the modeling and generation of entire traffic mixes and not just web traffic. In addition, their measurement methods were optimized for monitoring traffic near Google's web servers. The authors assumed independent short flows, data acquisition close to well-provisioned web servers, and no congestion in the client-to-server direction (which was plausible in the context of requests that were far smaller than responses).

2.2.2 Non-Web Traffic Source-level Modeling

Two prominent source-level modeling efforts took place before the invention of the World Wide Web. Danzig and Jamin [DJ91] developed tcplib, a collection of source-level descriptions of traffic.
It included descriptions of the following applications:

Telnet was described using three random variables: connection duration, packet interarrival time, and packet size. The initiator of the Telnet connection always sent one-byte packets, while the acceptor responded with packets matching the packet size distribution. The authors claimed that rlogin connections were also well-described by this model.

File Transfer Protocol (FTP) was described using three random variables: number of items transferred, item size (i.e., file size), and packet size. The model only described FTP-DATA transactions used to transfer a single file or a directory listing. It did not describe the FTP-Control connection that each client/server pair must use to manage each FTP-DATA transaction.

Simple Mail Transfer Protocol (SMTP) was described using only one random variable: item size, which included the size of mail messages and address verification (i.e., control) messages. Responses from the acceptor were considered negligible, and not modeled.

Network News Transfer Protocol (NNTP) was described using two random variables: number of items transferred, and size of items (i.e., NNTP articles). The bidirectional nature of the protocol and the use of control messages were not part of the model.

Tcplib also included a model of phone conversations with two random variables, talk spurt duration and quiet time (i.e., pause) duration, borrowed from [Bra65]. Each random variable was specified using an empirical CDF. Traffic generation involved using the inverse transformation method [Jai91] to sample each empirical CDF independently.

In general, the application models in tcplib were rather simplistic, but they represented a giant step forward from the non-measurement-derived models of the early 90s. However, the use and capabilities of the modeled applications have dramatically changed since the development of tcplib.
For example, the size of attachments in SMTP connections has dramatically increased due to the widespread implementation of Multipurpose Internet Mail Extensions (MIME). In addition, newer applications have become prominent or replaced the ones in tcplib. For example, the Telnet protocol has been mostly replaced by the Secure Shell (SSH) protocol. SSH is an encrypted protocol, so it requires more bytes per message. It also supports port forwarding, wherein other applications can communicate through SSH connections.

Paxson [Pax94] studied the same four applications as in tcplib, developing parametric models for each of them. Paxson also discussed how application characteristics change over time and across sites. This inherent variability motivated the use of parametric models, which are necessarily approximations of the empirical data. This approximation is not worse than the variability observed over time and across sites, so the author argued that parametric models were as accurate as empirical ones, but with the added benefits of being mathematically tractable and parsimonious. His analysis showed that bulk-transfer sizes were generally well-modeled by the log-normal distribution. Another of his findings was that connection inter-arrivals (except those of NNTP connections) were consistent with non-homogeneous Poisson arrivals, with fixed hourly rates.

The methodological contribution in Paxson's work is substantial. He demonstrated the difficulty of providing statistically valid parametric models of the distributions associated with Internet traffic. He consistently observed parametric fits that were clearly adequate when examined graphically, but that failed traditional goodness-of-fit tests. This was caused by the massive sample sizes, an endemic characteristic of traffic measurement datasets. As an alternative to the statistical tests, Paxson proposed the use of a goodness-of-fit metric, which provides a quantitative assessment of the distance between the empirical data and the parametric model.
His proposed metric is however insensitive to deviations in the tails, casting doubt on the approach due to the ubiquitous finding of heavy-tailed phenomena in network traffic.

Web traffic quickly dominated most traffic mixes after its emergence in 1995, and remained the most prominent traffic type until file-sharing applications surpassed it in recent years. This motivated a large body of work on web traffic characterization, and little attention was paid to other traffic. The models developed by Danzig, Jamin and Paxson were not improved or updated by other researchers.

File-sharing applications currently rival or frequently surpass web traffic in terms of traffic volume. They also represent a harder modeling problem than web traffic. The number of file-sharing applications is large and they use widely different communication strategies. Furthermore, the set of popular file-sharing applications is constantly changing. There is a growing body of traffic modeling literature focusing on file-sharing applications, but no traffic generator is yet available. Two prominent modeling studies were conducted at the University of Washington. Saroiu et al. [SGG02] studied Napster and Gnutella traffic, and Gummadi et al. [GDS+03] studied Kazaa traffic. Karagiannis et al. [KBBkc03] examined a larger set of file-sharing applications in backbone links.

Modeling of multimedia traffic has also received some attention. Variable bit-rate video was studied in Garrett et al. and Knightly et al. [GW94, KZ97]. Real Audio traffic was studied by Mena and Heidemann [MH00], providing a first source-level view of streaming media, mostly on UDP flows.

There are commercial synthetic traffic generation products such as Chariot [Inc] and IXIA, but these generators are typically based on a limited number of application source types. Moreover, it is not clear that any are based on empirical measurements of Internet traffic.
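As an aside on the tcplib approach reviewed earlier in this section, the inverse transformation method used to sample an empirical CDF can be sketched in a few lines of Python. This is only an illustrative sketch, not tcplib's implementation: the function name `sample_empirical_cdf` and the three-point distribution of item sizes are made up for the example, and the rule applied is the usual one of returning the smallest value whose cumulative probability is at least the uniform draw.

```python
import bisect
import random


def sample_empirical_cdf(values, cum_probs, u):
    """Inverse transformation: smallest value whose cumulative probability is >= u.

    values must be sorted in increasing order and cum_probs must be
    non-decreasing and end at 1.0 (both are assumptions of this sketch).
    """
    idx = bisect.bisect_left(cum_probs, u)
    return values[min(idx, len(values) - 1)]


# Made-up empirical CDF of item sizes in bytes (illustrative values only).
item_sizes = [64, 512, 1460]
item_cdf = [0.5, 0.9, 1.0]

# Each random variable is sampled independently from its own CDF.
rng = random.Random(1)
samples = [sample_empirical_cdf(item_sizes, item_cdf, rng.random()) for _ in range(5)]
print(samples)
```

A generator built this way reproduces the marginal distribution of each variable, but, because every CDF is sampled independently, it cannot preserve correlations between variables.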
2.2.3 Beyond Single Application Modeling

The need for more representative traffic generation has motivated research on methods that can tackle the modeling of the entire suite of applications using an Internet link. The work in this dissertation lies in this area. Our preliminary steps were an extension of the methods used to model web traffic in Smith et al. [SHCJO01] to model other applications, as described in Hernández-Campos et al. [HCJS+01]. The same kind of analysis of TCP header sequence numbers, acknowledgment numbers and connection quiet times applied to web traffic was used to populate models of SMTP and NNTP traffic. These models were derived from packet header traces collected at the University of North Carolina at Chapel Hill, and consisted of empirical distributions capturing different source-level characteristics of these protocols, such as object sizes.

Lan and Heidemann [LH02] conducted a related effort, reusing the same techniques and software tools for data acquisition. Their RAMP tool populated models of web and FTP traffic directly from packet header traces, and generated traffic accordingly.

Harpoon [SB04] also tackled the same problem that is the focus of this dissertation. Its authors considered the problem of analyzing entire traffic mixes and generating traffic accordingly. Their measurement methods were far less elaborate. Rather than the detailed models of the ADU exchange in TCP connections used in our work, Harpoon focused on modeling flows. Flows are defined as sets of packets with the same source and the same destination. As a consequence, Harpoon modeled each TCP connection as two unidirectional flows. Another difference with our approach is that Harpoon did not incorporate the notion of bidirectional data exchange, neither sequential nor concurrent, essentially treating multiple ADUs (as defined in the a-b-t model) as a single ADU. Idle times within connections were not part of the Harpoon traffic model either.
In addition, any measured flow (i.e., one side of a connection) with only a small amount of data or with only acknowledgment packets was not used for traffic generation. This substantially simplified the modeling, but it eliminated the rich packet-level dynamics observed in TCP connections, and demonstrated in later chapters of this dissertation. In addition to this, network-level parameters were not part of the data acquisition, so round-trip times and maximum receiver window sizes were arbitrarily chosen.

Harpoon could also generate UDP traffic. The underlying model was to send packets at a constant bit rate, with either fixed or exponentially distributed inter-arrival times. These models were not populated from measurement. Another novel feature of Harpoon was the ability to generate traffic that reproduced IP address structure according to a measured distribution of address frequency. Their study included a comparison between Harpoon's closed-loop traffic and traffic from a commercial (open-loop) packet-level traffic generator, demonstrating substantial differences. For example, closed-loop sources were shown to back off as congestion increases, while open-loop sources did not.

Like the work in this dissertation and Lan and Heidemann, Harpoon provided an automated method to acquire data and use it to generate traffic, which Sommers and Barford eloquently called "self-tuning" traffic generation. We could say that there is a growing consensus in the field of traffic generation regarding the need to develop tools that combine measurement and generation to tackle the wide variability over time and across links found in real Internet traffic.

2.3 Scaling Offered Load

One of the key requirements of traffic generation is the ability to scale the offered load, i.e., to generate a wide range of link loads with the same model of application behavior.
This makes it possible to evaluate the performance of a network mechanism under various loads, which translate into different degrees of congestion, while preserving the same application mix. For example, the evaluation of AQM mechanisms in [CJOS00, LAJS03] compared the performance of FIFO to RED and other AQM mechanisms for loads between 50% and 110% of the capacity of the link where the queuing mechanism was used. In these studies, the authors preceded their evaluation with a set of calibration experiments. These experiments were used to derive an expression for the linear dependency between the number of (web) user equivalents and the average offered load, which enabled the researchers to systematically scale offered loads in their evaluation experiments.

Calibration is generally applicable to any application-level model. When calibrating, the researchers try to relate one or more parameters of the model and the average offered load to obtain a calibration function. Deriving a calibration function is a time-consuming process, since an entire collection of experiments must be run to correlate offered load and model parameters with confidence.

Kamath et al. [KcLH+02] studied load scaling methods, but they concentrated only on scaling up the offered load. Their intention was to conduct experiments with much higher offered loads than those observed during measurement. In particular, they considered the problem of generating traffic for loading a 1 Gbps link using only measurements from a 10 Mbps link, an 11-hour packet header trace. The authors considered three different techniques. The first two techniques involved a transformation of the original trace into a scaled-up version, and then a packet-level replay. The first transformation technique was packet arrival scaling, which scales up the load by multiplying the arrival time of each packet in the original trace by a constant factor between 0 and 1 (i.e., shrinking packet inter-arrivals). In their study, they used a scaling factor of 0.001.
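The packet arrival scaling transformation just described amounts to a one-line operation on packet timestamps. The following Python sketch shows the idea under stated assumptions: the function name `scale_packet_arrivals` and the three sample arrival times are hypothetical, and the code is not taken from the tools used by Kamath et al.

```python
def scale_packet_arrivals(arrival_times, factor):
    """Multiply every arrival time by a constant factor between 0 and 1.

    Shrinking all arrival times by the same factor shrinks every
    inter-arrival gap by that factor, so the same packets are offered
    in a proportionally shorter interval.
    """
    if not 0 < factor < 1:
        raise ValueError("factor must be strictly between 0 and 1")
    return [t * factor for t in arrival_times]


# Hypothetical arrival times in seconds (illustrative values only).
original = [0.0, 2.0, 5.0]
scaled = scale_packet_arrivals(original, 0.001)  # 0.001 is the factor cited in the text
print(scaled)
```

Note that the packets themselves are unchanged, which is why this transformation can only compress time and cannot introduce new flows or new destinations.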
The second transformation technique is trace merging, which scales up load by merging, i.e., superimposing, the packet arrivals from more than one trace. They divided the 11-hour trace into 100 subtraces and then combined them to form a shorter, higher-throughput trace. The third technique is structural modeling, which meant developing a web traffic model from the original trace using the methods in Smith et al. [SHCJO01]. The authors did not discuss how the load created by this structural model was increased. Their analysis compared a number of distributions from the generated traces to those from the original trace. Packet arrival scaling was shown to completely distort flow durations and destination address diversity. Trace merging reproduced flow and packet arrival properties accurately, but it distorted destination address characteristics (studied using the number of unique addresses observed per unit of time). Web traffic generation was accurate, but it showed far less complex distributions of connection bytes, packet sizes, and connection durations. This is because a structural model based only on web traffic lacks the diversity of application behavior, and therefore communication patterns, in the original trace, which included traffic from many different applications and not just web traffic.

2.4 Implementing Traffic Generation

Source-level traffic generators for network testbeds (rather than for software simulators) are usually implemented using user-level programs that make use of the socket interface to generate traffic. This is the case for tcplib [DJ91], httperf [MJ98], SURGE [BC98], and other web traffic generators [BD99, CJOS00]. In order to introduce network-level parameters in testbed experiments, such as a realistic distribution of round-trip times, it is necessary to rely on a layer of simulation either in the end hosts or somewhere in the path of the traffic.
For example, Rizzo's dummynet [Riz97] makes it possible to apply arbitrary delays, loss rates and bandwidth constraints on the end systems to specific network flows or collections of network flows (that share a network prefix). The implementation combines event-driven simulation and packet queuing, and sits between the IP and link layers. Dummynet is part of the standard distribution of the FreeBSD operating system. The experiments in this dissertation were performed using an extended version of dummynet that can be controlled from the application layer [footnote 2].

Footnote 2: This is also possible in the original implementation, using one firewall rule for each flow, but it does not scale to the hundreds of simultaneous flows in our experiments.

Kamath et al. [KcLH+02] argue that source-level traffic generation is much more demanding in terms of CPU and memory processing than packet-level replay. While it is indeed true that far more CPU time is needed to simulate endpoint behavior and use network stacks, memory requirements are actually far more stringent for packet-level replay. This is because packet header traces are much longer than their source-level representations. For example, the approach in this dissertation considers the replay of source-level traces that are roughly 100 times smaller than the packet header traces from which they were derived.

2.5 Summary

Our review of related work has focused on the existing literature in network traffic generation, including works relevant for data acquisition and traffic modeling. Characterizing network traffic at the packet level provides important insights, such as the finding of pervasive self-similarity by Willinger et al. [WTSW97]. However, this approach does not provide the proper foundation for generating traffic for most experimental studies.
As argued by Floyd and Paxson [PF95], packet-level traffic generation breaks the end-to-end feedback loop in adaptive network protocols, such as TCP, resulting in traffic that does not react to the experimental conditions realistically. In contrast, source-level models enable closed-loop traffic generation, so they are applicable to a wider range of situations.

In the past, source-level traffic generation has been associated with models of application behavior. Our overview of the state of the art discussed several highly influential works devoted to application-level modeling. Cáceres et al. [CDJM91] introduced empirical application models to networking research. Paxson [Pax94] proposed the use of more statistically rigorous methods for developing parametric source-level models. Crovella et al. [CB96] developed a rich model of web traffic, and explained self-similarity in terms of source-level characteristics.

Application-level modeling has some important shortcomings that provide the motivation for this dissertation. Internet traffic mixes are created by a large number of distinct applications, so single application models are not representative of real traffic. Furthermore, the composition of traffic mixes is constantly changing, and even individual applications often evolve, modifying the way in which they interact with the network. As a consequence, the number of high-quality application-level models is small (and insufficient), and these models are hardly ever updated. In this dissertation, we propose a more scalable approach to source-level modeling, where application behavior is described in a generic, but still detailed, manner. Furthermore, our data acquisition methods are efficient and mostly automated, dramatically reducing the time to go from measurement to traffic generation.

Our combination of data acquisition and traffic generation is most closely related to two contemporary works.
Sommers and Barford [SB04] developed the Harpoon approach for generating traffic mixes whose characteristics are derived from measurements in an algorithmic manner. Their approach did not include any detailed source-level modeling of TCP connections. They described a connection simply as a unidirectional file transfer whose size is equal to the total amount of payload in its packets. In contrast, our primary emphasis is on detailed source-level modeling, where we introduce the a-b-t model and uncover the dichotomy between sequential and concurrent data exchange. Harpoon made use of simplified network-level parameters, which are set to arbitrary constants. In our approach, network-level parameters are carefully measured and incorporated into the traffic generation. The work by Sommers and Barford considered two issues that are not addressed in our own work. First, they proposed a method for generating UDP traffic. The underlying source-level model is however not derived from measurement. Second, they reproduced the IP address distribution in the replayed trace. This cannot be performed with publicly available traces, like ours, since they are anonymized.

Another work similar to ours is Cheng et al. [CHC+04a]. The authors presented a method for characterizing packet header traces of web traffic and accurately replaying them. Generated traffic was evaluated by comparing the original trace with its synthetic version generated in a testbed. We tackle the same source-level trace replay problem but applied to every application rather than only to web traffic. Our approach is more ambitious and necessarily more abstract.

Our work also considers the common problems of resampling and scaling traffic load in networking experiments. In general, scaling offered load has been performed by conducting a preliminary experimental study to relate the parameters of the source-level model and the offered load. For example, Christiansen et al.
[CJOS00] computed a calibration function that described offered load as a function of the number of user equivalents employed in web traffic generation. We propose an alternative approach that eliminates the need for preliminary calibration studies.

CHAPTER 3

Abstract Source-level Modeling

model: (11a) a description or analogy used to help visualize something (as an atom) that cannot be directly observed.
- Merriam-Webster English Dictionary

Anything that has real and lasting value is always a gift from within.
- Franz Kafka (1883-1924)

Abstract source-level modeling provides a method to describe the workload of a TCP connection at the source level in a manner that is not tied to the specifics of individual applications. The starting point of this method is the observation that at the transport level, a TCP endpoint is doing nothing more than sending and receiving data. Each application (e.g., web browsing, file sharing, etc.) employs its own set of data units for carrying application-level control messages, files, and other information. The actual meaning of the data is irrelevant to TCP, which is only responsible for delivering data in a reliable, ordered, and congestion-responsive manner. As a consequence, we can describe the workload of TCP in terms of the demands by upper layers of the protocol stack for sending and receiving Application Data Units (ADUs). This workload characterization captures only the sizes of the units of data that TCP is responsible for delivering, and abstracts away the details of each application (e.g., the meaning of its ADUs, the size of the socket reads and writes, etc.). The approach makes it feasible to model the entire range of TCP workloads, and not just those that derive from a few well-understood applications as is the case today. This provides a way to overcome the inherent scalability problem of application-level modeling.
While the work of a TCP endpoint is to send and receive data units, its lifetime is not only dictated by the time these operations take, but also by quiet times in which the TCP connection remains idle, waiting for upper layers to make new demands. TCP is only affected by the duration of these periods of inactivity and not by the cause of these quiet times, which depends on the dynamics of each application (e.g., waiting for user input, processing a file, etc.). Longer lifetimes have an important impact, since the endpoint resources needed to handle TCP state must remain reserved for a longer period of time [footnote 1]. Furthermore, the window mechanism in TCP tends to aggregate the data of those ADUs that are sent within a short period of time, reducing the number of segments that have to travel from source to destination. This is only possible when TCP receives a number of back-to-back requests to send data. If these requests are separated by significant quiet times, no aggregation occurs and the data is sent using at least as many segments as ADUs.

We have formalized these ideas into the a-b-t model, which describes TCP connections as sets of ADU exchanges and quiet times. The term a-b-t is descriptive of the basic building blocks of this model: a-type ADUs (a's), which are sent from the connection initiator to the connection acceptor, b-type ADUs (b's), which flow in the opposite direction, and quiet times (t's), during which no data segments are exchanged. We will make use of these terms to describe the source-level behavior of TCP connections throughout this dissertation.

The a-b-t model has two different flavors depending on whether ADU interleaving is sequential or concurrent. The sequential a-b-t model is used for modeling connections in which only one ADU is being sent from one endpoint to the other at any given point in time.
This means that the two endpoints engage in an orderly conversation in which one endpoint will not send a new ADU until it has completely received the previous ADU from the other endpoint. In contrast, the concurrent a-b-t model is used for modeling connections in which both endpoints send and receive ADUs simultaneously.

The a-b-t model not only provides a reasonable description of the workload of TCP at the source level, but it is also simple enough to be populated from measurement. Control data contained in TCP headers provide enough information to determine the number and sizes of the ADUs in a TCP connection and the durations of the quiet times between these ADUs. This makes it possible to convert an arbitrary trace of segment headers into a set of a-b-t connection vectors, in which each vector describes one of the TCP connections in the trace. As long as this process is accurate, this approach provides realistic characterizations of TCP workloads, in the sense that they can be empirically derived from measurements of real Internet links.

Footnote 1: Similarly, if resources are allocated along the connection's path, they must be committed for a longer period.

In this chapter, we describe the a-b-t model and its two flavors in detail. For each flavor, we first discuss a number of sample connections that illustrate the power of the a-b-t model to describe TCP connections driven by different applications, and point out some limitations of this approach. We then present a set of techniques for analyzing segment headers in order to construct a-b-t connection vectors and provide a validation of these techniques using traces from synthetic applications. We finally examine the characteristics of a set of real traces from the point of view of the a-b-t model, providing a source-level view of the workload of TCP.

3.1 The Sequential a-b-t Model

3.1.1 Client/Server Applications

The a-b-t connection vector of a sequential TCP connection is a sequence of one or more epochs.
Each epoch describes the properties of a pair of ADUs exchanged between the two endpoints. The concept of an epoch arises from the client/server structure of many distributed systems, in which one endpoint acts as a client and the other one as a server. The client sends a request for some service (e.g., performing a computation, retrieving some data, etc.) that is followed by a response from the server (e.g., the results of the requested action, a status code, etc.). An epoch represents our abstract characterization of a request/response exchange. An epoch is characterized by the size a of the request and the size b of the response.

HTTP, the protocol that underlies the World-Wide Web, provides a good example of the kinds of TCP workloads created by client/server applications. Figure 3.1 shows a simple a-b-t diagram that represents a TCP connection between a web browser and a web server, which communicate using the HTTP 1.0 application-layer protocol [BLFF96]. In this example, the web browser (client side) initiates a TCP connection to a web server (server side) and sends a request for an object (e.g., HTML source code, an image, etc.) specified using a Uniform Resource Locator (URL). This request constitutes an ADU of size 341 bytes. The server then responds by sending the requested object in an ADU of size 2,555 bytes. The representation in the figure captures: the sequential order of the ADUs within the TCP connection (first the HTTP request, then the HTTP response; in this case, order also implies "causality"); the direction in which the ADUs flow (above the time line for the ADU sent from the connection initiator to the connection acceptor; below the time line for the ADU sent from the connection acceptor to the connection initiator); and the sizes of the ADUs (using annotations and the lengths of the rectangles, which are proportional to the number of bytes).
The diagram provides a visualization in the spirit of abstract source-level modeling, since it does not incorporate any specific information about the actual contents of the ADUs. The bytes in the first ADU (HTTP request) represent an HTTP header that includes a URL, and the bytes in the second ADU (HTTP response) represent an HTTP header (with a success code of 200 OK) followed by the requested object (e.g., HTML source code). In this example, the purpose of this particular connection was well understood, and that allowed us to assign labels to the ADUs (HTTP request and response) and to the TCP endpoints (web browser and server). In general, when we examine how the ADUs flow in an arbitrary TCP connection, we do not have this application-specific information (or we can only guess it). The same diagram (without the HTTP-specific labels) could be used to represent different connections with completely different payloads in ADUs of the same size. The diagram does not include any network-level information either, so this diagram could also represent connections with very different maximum segment sizes, round-trip times, and other network properties below the application level. Note that this example, and the following ones, came from real connections that were actually observed. In some cases, we had access to the actual segment payloads and used them to add annotations to the ADUs. In other cases, we used port numbers and our understanding of the protocols to add these annotations.

Figure 3.1: An a-b-t diagram representing a typical ADU exchange in HTTP version 1.0.

Figure 3.2: An a-b-t diagram illustrating a persistent HTTP connection.

Some client/server applications use a new connection for each request/response exchange, while other applications reuse a connection for more than one exchange, creating connections with more than one epoch.
As long as the application has enough data to send, multi-epoch connections can improve performance substantially, by avoiding the connection establishment delay and TCP's slow start phase. For example, HTTP was revised to support more than one request/response exchange in the same "persistent" TCP connection [FGM+97]. Figure 3.2 illustrates this type of interaction. This is a connection between a web browser and a web server, in which the browser first requests the source code of an HTML page, and receives it from the web server, just like in Figure 3.1. However, the use of persistent HTTP makes it possible for the browser to send another request using the same connection. Unlike the example in Figure 3.1, this persistent connection remains open after the first object is downloaded, so the browser can send another request without first closing the connection and opening a new one. In Figure 3.2 the web browser sends three ADUs that specify three different URLs, and the server responds with three ADUs. Each response ADU contains an HTTP header that precedes the actual requested object. If the requested object is not available, the ADU may only contain the HTTP header with an error code. Note that the diagram has been annotated with extra application-level information showing that the first two epochs were the result of requesting objects from the same document (i.e., same web page), and the last epoch was the result of requesting a different document.

The diagram in Figure 3.2 includes two time gaps between epochs (represented with dashed lines). In both cases, these are quiet times in the interaction between the two endpoints. We call the time between the end of one epoch and the beginning of the next the inter-epoch quiet time. The first quiet time in the a-b-t diagram represents processing time in the web browser, which parsed the web page it received, retrieved some objects from the local cache, and then made another request for an object in the same document (that was not in the local cache).
Because of its longer duration, the second quiet time is most likely due to the time taken by the user to read the web page and click on one of the links, starting another page download from the same web server. As will be discussed in Section 3.3, it is difficult to distinguish quiet times caused by application dynamics, which are relevant for a source-level model, from those due to network performance and characteristics, which should not be part of a source-level model (because they are not caused by the behavior of the application). The basic heuristic employed to distinguish between these two cases is the observation that the scale of network events is hardly ever above a few hundred milliseconds [footnote 2]. Going back to the example in Figure 3.2, the only quiet time that could be safely assumed to be due to the application (in this case, due to the user) is the one between the second and third epochs. The 120-millisecond quiet time between the first and second epochs could easily be due to network effects (such as having the sending of the second request delayed by Nagle's algorithm [Nag84]), and therefore should not be part of the source-level behavior. Similarly, the two a-b-t diagrams shown so far have not depicted any time between the request and the response inside the same epoch. In general, web servers process requests so quickly that there is no need to incorporate intra-epoch quiet times in a model of the workload of a TCP connection. While this is by far the most common case, some applications do have long intra-epoch quiet times, and the a-b-t model can include these.

Footnote 2: Some infrequent events, such as routing changes due to link failures, can last several seconds. We generally model large numbers of TCP connections, so the few occasions in which we confuse application quiet times with long network quiet times have no measurable statistical impact when generating network traffic.
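The quiet-time heuristic described above can be sketched as a simple threshold test. The sketch below is only an illustration: the text says "a few hundred milliseconds" without fixing a number, so the 0.5-second cutoff and the name `likely_application_quiet_time` are assumptions made for the example, and the actual method is the subject of Section 3.3. The two gaps are the inter-epoch quiet times that this chapter reports for the persistent HTTP connection of Figure 3.2.

```python
# Hypothetical cutoff: the text only states that network events are hardly
# ever above "a few hundred milliseconds", so 0.5 s is an assumed value.
NETWORK_SCALE_SECONDS = 0.5


def likely_application_quiet_time(duration_s, threshold_s=NETWORK_SCALE_SECONDS):
    """Return True when a quiet time is too long to be plausibly a network effect."""
    return duration_s > threshold_s


# Inter-epoch quiet times (seconds) of the persistent HTTP example.
inter_epoch_gaps = [0.12, 3.12]
labels = [likely_application_quiet_time(g) for g in inter_epoch_gaps]
print(labels)
```

With this cutoff, only the gap between the second and third epochs is attributed to the application, which matches the reading of Figure 3.2 given in the text.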
Formally, a sequential a-b-t connection vector has the form $C_i = (e_1, e_2, \ldots, e_n)$ with $n \geq 1$ epoch tuples. An epoch tuple has the form $e_j = (a_j, ta_j, b_j, tb_j)$ where

$a_j$ is the size of the $j$th ADU sent from the connection initiator to the connection acceptor. $a_j$ will also be used to name the $j$th ADU sent from the initiator to the acceptor.

$b_j$ is the size of the $j$th ADU sent in the opposite direction (and generally in response to the request made by $a_j$).

$ta_j$ is the duration of the quiet time between the arrival of the last segment of $a_j$ and the departure of the first segment of $b_j$. $ta_j$ is defined from the point of view of the acceptor (often the server), but ultimately our estimate of the duration is based on the arrival times of segments at some monitoring point.

$tb_j$ is either the duration of the quiet time between $b_j$ and $a_{j+1}$ (for connections with at least $j+1$ epochs), or the quiet time between the last data segment (i.e., last segment with a payload) in the connection and the first control segment used to terminate the connection.

Note that $ta_j$ is a quiet time as seen from the acceptor side, while $tb_j$ is a quiet time as seen from the initiator side. The idea of these definitions is to capture the network-independent component of quiet times, without being concerned with the specific measurement method. In a persistent HTTP connection, a's would usually be associated with HTTP requests, b's with HTTP responses, ta's with processing times on the web server, and tb's with browser processing times and user think times. We can say that a quiet time $ta_j$ is "caused" by an ADU $a_j$, and that a quiet time $tb_j$ is caused by an ADU $b_j$. Both time components are defined as quiet times observed at one of the endpoints, and not at some point in the middle of the network where the packet header tracing takes place.
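To make the definition concrete, a sequential connection vector can be represented as a list of four-field epoch tuples. The Python sketch below is one possible encoding, not the representation used by the tools in this dissertation: the names `Epoch`, `make_connection_vector` and `total_bytes` are hypothetical. The example vector encodes the single-epoch HTTP 1.0 connection discussed earlier (a 341-byte request followed by a 2,555-byte response), with both quiet times set to zero.

```python
from typing import List, NamedTuple, Tuple


class Epoch(NamedTuple):
    a: int      # size in bytes of the ADU sent from initiator to acceptor
    ta: float   # quiet time in seconds between a_j and b_j (acceptor side)
    b: int      # size in bytes of the ADU sent from acceptor to initiator
    tb: float   # quiet time in seconds after b_j (initiator side)


def make_connection_vector(epochs: List[Epoch]) -> List[Epoch]:
    """Build a sequential connection vector, enforcing n >= 1 epochs."""
    if len(epochs) < 1:
        raise ValueError("a connection vector needs at least one epoch")
    return list(epochs)


def total_bytes(vector: List[Epoch]) -> Tuple[int, int]:
    """Total bytes carried in the a direction and in the b direction."""
    return sum(e.a for e in vector), sum(e.b for e in vector)


# Single-epoch HTTP 1.0 connection: 341-byte request, 2,555-byte response.
http10 = make_connection_vector([Epoch(a=341, ta=0.0, b=2555, tb=0.0)])
print(total_bytes(http10))
```

Keeping the two quiet times as separate fields preserves the distinction between acceptor-side and initiator-side idle periods that the definition draws.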
As mentioned in the introduction, the name of the model comes from the three variable names used in this model, which are used to capture the essential source-level properties: data in the "a" direction, data in the "b" direction, and time "t" (non-directional, but associated with the processing of the preceding ADU, as discussed in Section 3.1.1). Using the notation of the a-b-t model, we can succinctly describe the HTTP connection in Figure 3.1 as a single-epoch connection vector of the form

$((341, 0, 2555, 0))$

where the first ADU, $a_1$, has a size of 341 bytes, and the second ADU, $b_1$, has a size of 2,555 bytes. In this example the time between the transmission of the two data units and the time between the end of $b_1$ and connection termination are considered too small to be included in the source-level representation, so they are set to 0. Similarly, we can represent the persistent HTTP connection shown in Figure 3.2 as

$((329, 0, 403, 0.12), (403, 0, 25821, 3.12), (356, 0, 1198, 15.3))$

where quiet times are given in seconds. Notice that $tb_3$ is not zero for this connection, but a large number of seconds (in fact, probably larger than the duration of the rest of the activity in the connection!). Persistent connections are often left open in case the client decides to send a new HTTP request reusing the same TCP connection [footnote 3]. As we will show in Section 3.5, this separation is frequent enough to justify incorporating it in the model. Gaps between connection establishment and the sending of $a_1$ are almost nonexistent.

As another example, the Simple Mail Transfer Protocol (SMTP) connection in Figure 3.3 illustrates a sample sequence of data units exchanged by two SMTP servers.
The first server (labeled "sender") previously received an email from an email client, and uses the TCP connection in the diagram to contact the destination SMTP server (i.e., the server for the domain of the destination email address). In this example, most data units are small and correspond to application-level (SMTP) control messages (e.g., the host info message, the initial HELO message, etc.) rather than application objects. The actual email message of 22,568 bytes was carried in ADU $a_6$. The a-b-t connection vector for this connection is

$((0, 0, 93, 0), (32, 0, 191, 0), (77, 0, 59, 0), (75, 0, 38, 0), (6, 0, 50, 0), (22568, 0, 44, 0))$.

Note that this TCP connection illustrates a variation of the client/server design in which the server sends a first ADU identifying itself without any prior request from the client. This pattern of exchange is specified by the SMTP protocol, wherein servers identify themselves to clients right after connection establishment. Since $b_1$ is not preceded by any ADU sent from the connection initiator to the connection acceptor, the vector has $a_1 = 0$ (we sometimes refer to this phenomenon as a "half-epoch").

Figure 3.3: An a-b-t diagram illustrating an SMTP connection.

Footnote 3: In general, persistent HTTP connections are closed by web servers after a maximum number of request/response exchanges (epochs) is reached or a maximum quiet time threshold is exceeded. By default, Apache, the most popular web server, limits the number of epochs to 5 and the maximum quiet time to 15 seconds.

This last example illustrates an important characteristic of TCP workloads that is often ignored in traffic generation experiments. TCP connections do not simply carry files (and requests for files), but are often driven by more complicated interactions that impact TCP performance. An epoch where $a_j > 0$ and $b_j > 0$ requires at least one segment to carry $a_j$ from the connection initiator to the acceptor, and at least another segment to carry $b_j$ in the opposite direction.
The minimum duration of an epoch is therefore one round-trip time (which is precisely defined as the time to send a segment from the initiator to the acceptor plus the time to send a segment from the acceptor back to the initiator). This means that the number of epochs imposes a minimum duration and a minimum number of segments for a TCP connection.

The connection in Figure 3.3 needs 4 round-trip times to complete the "negotiation" that occurs during epochs 2 to 5, even if the ADUs involved are rather small. The actual email message in ADU $a_6$ is transferred in only 2 round-trip times. This is because $a_6$ fits in 16 segments⁴, and it is sent during TCP's slow start. Thus the first round-trip time is used to send 6 segments, and the second round-trip time is used to send the remaining 10 segments. The duration of this connection is therefore dominated by the control messages, and not by the size of the email. In particular, this is true despite the fact that the email message is much larger than the combined size of the control messages. If the application protocol (i.e., SMTP) were modified to somehow carry control messages and the email content in ADU $a_2$, then the entire connection would last only 4 round-trip times instead of 6, and would require fewer segments. In our experience, it is common to find connections in which the number of control messages is orders of magnitude larger than the number of ADUs from files or other dynamically-generated content. Clearly, epoch structure has an impact on the performance (more precisely, on the duration) of TCP connections and should therefore be modeled accurately.

Figure 3.4: Three a-b-t diagrams representing three different types of NNTP interactions.
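The arithmetic behind the SMTP example can be checked with a small sketch. The code below is only an illustration, not part of the measurement tools described in this dissertation: the names `Epoch`, `SMTP_VECTOR`, `segments_needed`, `full_epochs` and `min_segments` are hypothetical, and the 1,460-byte maximum segment size is the assumption stated in the footnote on segment sizes.

```python
import math
from typing import List, Tuple

# One epoch of a sequential connection vector: (a, ta, b, tb).
# Sizes are in bytes, quiet times in seconds.
Epoch = Tuple[int, float, int, float]

# Connection vector of the SMTP connection in Figure 3.3.
SMTP_VECTOR: List[Epoch] = [
    (0, 0, 93, 0), (32, 0, 191, 0), (77, 0, 59, 0),
    (75, 0, 38, 0), (6, 0, 50, 0), (22568, 0, 44, 0),
]

def segments_needed(adu_bytes: int, mss: int = 1460) -> int:
    """Minimum number of full-size segments needed to carry one ADU."""
    return math.ceil(adu_bytes / mss)

def full_epochs(vector: List[Epoch]) -> int:
    """Epochs with a > 0 and b > 0; each costs at least one round-trip time."""
    return sum(1 for a, _, b, _ in vector if a > 0 and b > 0)

def min_segments(vector: List[Epoch]) -> int:
    """Lower bound on the number of data segments in the connection."""
    return sum(segments_needed(a) + segments_needed(b) for a, _, b, _ in vector)

if __name__ == "__main__":
    print(segments_needed(22568))    # segments needed for the email ADU
    print(full_epochs(SMTP_VECTOR))  # epochs that require a full round trip
    print(min_segments(SMTP_VECTOR)) # lower bound on data segments
```

Under these assumptions the email ADU needs 16 segments, while each of the ten small control ADUs needs one segment of its own, which is why the control exchange, and not the email, dominates the duration of the connection.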
Application protocols can be rather complicated, supporting a wide range of interactions between the two endpoints. Most of them assume a client/server model of interaction and hence can be cast into the sequential a-b-t model. For example, Figure 3.4 shows three types of interactions that are supported by the Network News Transfer Protocol (NNTP) [KL86, Bar00].

⁴ This assumes the standard maximum segment size, 1,460 bytes, and a maximum receiver window of at least 10 full-size segments. A large fraction of TCP connections observed on real networks satisfy these assumptions.

The first a-b-t diagram exhibits the straightforward behavior of an NNTP reader (i.e., a client for reading newsgroup postings) posting a new article. The two endpoints exchange a few control messages in the first three epochs, and then the client uploads the content of the article in ADU $a_4$.

The second connection shows an NNTP reader using a TCP connection to first check whether the server knows about any new articles in two newsgroups (unc.support and unc.test). After that, the reader requests an overview of those messages (using XOVER). The server replies with the subjects of the new articles and some other information. Finally, after 5.02 seconds of inactivity, the reader requests the content of one of the new articles. This relatively long time suggests that the user of the NNTP reader waited some time before actually requesting the reader to display the content of a new article.

The way NNTP servers interact is illustrated in the third connection. One of the peers will ask the other about new newsgroups and articles. This typically involves hundreds or even thousands of ADUs sent in each direction. The connection shown here has only a small subset of the ADUs observed in one of these connections between NNTP peers. Here the initiator peer asked for new groups first, and then for new articles. One article was sent from the initiator to the acceptor, and another one in the opposite direction.
These examples provide a good illustration of the complexity of modeling applications one by one, and they provide further evidence supporting the claim that our abstract source-level model is widely applicable. In general, the use of a multi-epoch model is essential to accurately describe how applications drive TCP connections.

Incorporating Quiet Times into Source-Level Modeling

Unlike ADUs, which flow from the initiator to the acceptor or vice versa, quiet times are not associated with any particular direction of a TCP connection. However, we have chosen to use two types of quiet times in our sequential a-b-t model. This choice is motivated by the intended meaning of quiet time, and by the difference between the duration of the quiet times observed at different points in the connection's path.

When we were developing the model, we initially considered quiet times independent of the endpoint causing them. They were simply "connection quiet times". In practice, quiet times in sequential connections are associated with source-level behavior in only one of the endpoints. For example, a "user think time" in an HTTP connection is associated with a quiet time on the initiator side (which is waiting for the user action), while a server processing delay in a Telnet connection is associated with the acceptor side (which is waiting for a result). In every case, one endpoint is quiet for some period before sending new data, and the other endpoint remains quiet, waiting for these new data to arrive. Having two types of quiet times, $ta$ and $tb$, makes it possible to annotate the side of the connection that is the source of the quiet time.

The second reason for the use of two types of quiet times is that the duration of the quiet time depends on the point at which the quiet time is measured. The endpoint that is not the source of the quiet time will observe a quiet time that depends on the network and not only on the source-level behavior of the other endpoint.
This is because the new ADU which defines the end of the quiet time needs some time to reach its destination. In the example in Figure 3.2, the quiet time between $a_1$ and $b_1$ observed by the server endpoint is very small (only the time needed to retrieve the requested URL). However, this quiet time is longer when observed by the client, since it is the time between the last socket write of $a_1$ and the first socket read of $b_1$. It includes the server processing time, and at least one full round-trip time. Ideally, we would like to measure this quiet time $ta_1$ on the server side, in order to characterize source-level behavior in a completely network-independent manner. Similarly, we would like to measure $tb_1$ on the client side. In summary, source-level quiet times are non-directional, in the sense that they do not travel in one direction or the other, but they are associated with one of the endpoints, which is the source of the quiet time.

Figure 3.5: An a-b-t diagram illustrating a server push from a webcam using a persistent HTTP connection.

3.1.2 Beyond Client/Server Applications

Not all applications follow the strict pattern of requests and responses that characterizes traditional client/server applications. For example, HTTP is commonly used for server push operations⁵, in which the server periodically refreshes the state of the client without any prior request. Figure 3.5 illustrates this behavior using a TCP connection where a web browser first requests a webcam URL (UNC's "Pitcam" in this example), and the web server responds with a sequence of image frames separated by small quiet times. The browser renders each frame as soon as it is received, creating a continuous movie. Each frame can be considered an individual ADU, so this connection does not follow the basic request/response sequence of previous examples.
The notation provided by the sequential a-b-t model can still be used to represent this source-level behavior using the connection vector $(e_1, e_2, e_3, e_4, e_5)$ where

$e_1 = (392, 0.041, 97939, 0)$,
$e_2 = (0, 0.057, 97942, 0)$,
$e_3 = (0, 0.035, 97820, 0)$,
$e_4 = (0, 0.054, 97820, 0)$, and
$e_5 = (0, 0.037, 98019, 0)$.

While this connection has no natural epochs in the request/response sense, we can describe the connection by assigning each frame to a separate $b_j$, and each quiet time between frames to a $ta_j$ (since the connection vector is intended to capture a quiet time on the server side).

⁵ HTTP server push is implemented using a special content type, x-mixed-replace, which makes the browser expect a response object that is composed of other objects (separated by a configurable boundary string). Since no limit is imposed on the number of objects in this composite, webcam movies are usually implemented as a simple sequence of JPEG images that the web browser reads and renders continuously until the user moves to another page. This type of web service should not be confused with HTML's automatic page refresh tag, which is commonly used for slow-rate webcams (e.g., one image every 30 seconds). In this case, the browser refreshes the current page by downloading it again, and hence the interaction follows the regular request/response pattern.

The same type of server push behavior is found in streaming applications. A TCP connection carrying Icecast traffic (from ibiblio.org) is shown in Figure 3.6. Icecast is a popular audio streaming application that follows the same pattern of ADUs discussed in the previous paragraph, and can be described using the same type of connection vector. Each $b_j$ is associated with an MPEG audio frame. Note that the sizes of the ADUs and the durations of the quiet times between them are highly variable, unlike the example in Figure 3.5.

Figure 3.6: An a-b-t diagram illustrating Icecast audio streaming in a TCP connection.

Perhaps surprisingly, TCP is widely used for carrying streaming traffic today, despite its inability to perform the typical trade-off between loss recovery and delay in multimedia applications. Streaming over TCP has two significant benefits:

• Streaming traffic can use TCP port numbers associated with web traffic and therefore overcome firewalls that block other port numbers. This is important for web sites that deliver web pages and multimedia streams, since it guarantees that the user will be able to download the multimedia content.

• Most clients experience such low loss rates that TCP's loss recovery mechanisms have an insignificant impact on the timing of the stream. The common use of stream buffering prior to the beginning of the playback further reduces the impact of loss recovery.

Figure 3.7: Three a-b-t diagrams of connections taking part in the interaction between an FTP client and an FTP server.
The interaction between the two endpoints of a client/server application does not generally require more than one TCP connection to be opened between the two endpoints. As we have seen, some applications use a new connection for each request/response exchange, while others make use of multi-epoch connections (e.g., persistent connections in HTTP/1.1). Handling more than one TCP connection can have some performance benefits, but it does complicate the implementation of the applications (e.g., it may require using concurrent programming techniques). However, some applications do interact using several TCP connections, and this creates interdependencies between ADUs.

For example, Figure 3.7 illustrates an FTP session⁶ between an FTP client program and an FTP server in which three connections are used. The connection in the top row is the "FTP control" connection used by the client to first identify itself (with username and password), then list the contents of a directory, and then retrieve a large file. The actual directory listing and the file are received using separate "FTP data" connections (established by the client) with a single ADU $b_1$. The figure illustrates how the start of the data connections depends on the use of some ADUs in the control connection (i.e., the directory listing and the file transfer do not occur until after the LIST and RETR ADUs have been received), and how the control connection does not send the 226 Complete ADU until the data connections have completed.

While the sequential a-b-t model can accurately describe the source-level properties of these three connections, the model cannot capture the interdependency between the connections. The FTP example in Figure 3.7 shows three connections with a strong dependency. The two FTP data connections necessarily followed a 150 Opening operation in the FTP control connection. Our current model cannot express this kind of dependency between connections or between the ADUs of more than one connection.
It would be possible to develop a more sophisticated model capable of describing these types of dependencies, but it seems very difficult to populate such a model from traces in an accurate manner without knowledge of application semantics. As an alternative, the traffic generation approach proposed in this dissertation carefully reproduces relative differences in connection start times, which tends to preserve temporal dependencies between connections. Our experimental results also suggest that the impact of interconnection dependencies is negligible, at least for our collection of Internet traces.

⁶ This is an abbreviated version of the original session, in which there was some directory navigation and more directory listings. The control connection used port 21, while the data connections used dynamically selected port numbers. Note also that significant inter-ADU times due to user think time are not shown in the diagram.

Figure 3.8: An a-b-t diagram illustrating an NNTP connection in "stream-mode", which exhibits data exchange concurrency.

Figure 3.9: An a-b-t diagram illustrating the interaction between two BitTorrent peers.

3.2 The Concurrent a-b-t Model

In the sequential model we have considered so far, application data is either flowing from the client to the server or from the server to the client. However, some TCP connections are not driven by this traditional style of client/server interaction.
Some applications send data from both endpoints of the connection at the same time. Figure 3.8 shows an NNTP connection between two NNTP peers (servers) in which NNTP's "streaming mode" is used. As shown in the diagram, ADUs $b_5$ and $b_6$ are sent from the connection acceptor to the connection initiator while ADU $a_6$ is being sent in the opposite direction. ADUs $b_5$ and $b_6$ carry 438 messages, where the acceptor NNTP peer tells the initiator that it is not interested in articles id3 and id4. ADU $a_6$ carried article id2 in the opposite direction. There is no causal dependency between these ADUs, which makes it possible for the two endpoints to send data independently. Therefore this connection is said to exhibit data exchange concurrency, in the sense that one or more pairs of ADUs are exchanged simultaneously. In contrast, the connections illustrated in previous figures exchanged data units in a sequential fashion.

A fundamental difference between these two types of communication patterns is that sequential request/response exchanges (i.e., epochs) always take a minimum of one round-trip time. Data exchange concurrency makes it possible to send and receive more than one ADU per round-trip time, and this can increase throughput substantially. In the figure, the initiator NNTP peer is able to send check requests to the other party more quickly because it can do so without waiting for the corresponding responses, each of which would take a minimum of one full round-trip time to arrive.

Another example of concurrent data exchange is shown in Figure 3.9. Here two BitTorrent peers [Coh03] exchange pieces of a large file that both peers are trying to download. The BitTorrent protocol supports the backlogging of requests (i.e., pieces k and m of the file are requested before the download of the preceding piece is completed), and also the simultaneous exchange of file pieces (i.e., the transmission of pieces k and l of the file coexists with the transmission of piece m).
As discussed above, this type of behavior helps to avoid quiet times in BitTorrent connections, thereby increasing average throughput. Furthermore, this example illustrates a type of application in which both endpoints act as client and server (both request and receive file pieces).

Application designers make use of data concurrency for two primary purposes:

• Keeping the pipe full, by making use of requests that overlap with uncompleted responses. Rather than waiting for the response of the last request to arrive, the client keeps sending new requests to the server, building up a backlog of pending requests. The server can therefore send responses back-to-back, and maximize its use of the path from the server to the client. Without concurrency, the server remains idle between the end of a response and the arrival of a new request, hence the path cannot be fully utilized.

• Supporting "natural" concurrency, in the sense that some applications do not need to follow the traditional request/response paradigm. In some cases, the endpoints are genuinely independent, and there is no natural concept of request/response.

Examples of protocols that attempt to keep the pipe full are the pipelining mode in HTTP, the streaming mode in NNTP, the Rsync protocol for file system synchronization, and the BitTorrent protocol for file-sharing. Examples of protocols/applications that support natural concurrency are instant messaging and Gnutella (in which the search messages are simply forwarded to other peers without any response message). Since BitTorrent supports client/server exchanges in both directions, and these exchanges are independent of each other, we can say that BitTorrent also supports a form of natural concurrency.
For data-concurrent connections, we use a different version of our a-b-t model in which the two directions of the connection are modeled independently by a pair $(\alpha, \beta)$ of connection vectors of the form

$\alpha = ((a_1, ta_1), (a_2, ta_2), \ldots, (a_{n_a}, ta_{n_a}))$

and

$\beta = ((b_1, tb_1), (b_2, tb_2), \ldots, (b_{n_b}, tb_{n_b}))$

Depending on the nature of the concurrent connection, this model may or may not be a simplification. If the sides of the connection are truly independent, the model is accurate. Otherwise, if some dependency exists, it is not reflected in our characterization (e.g., the fact that request $a_i$ necessarily preceded response $b_j$ is lost). Our current data acquisition techniques cannot distinguish these two cases, and we doubt that a technique to accurately distinguish them exists. In any case, the two independent vectors in our concurrent a-b-t model provide enough detail to capture the two uses of concurrent data exchange in a manner relevant for traffic generation.

In the case of pipelined requests, one side of the connection mostly carries large ADUs with little or no quiet time between them (i.e., backlogged responses). The exact timing at which the requests arrive in the opposite direction is irrelevant as long as there is always an ADU carrying a response to be sent. It is precisely the purpose of the concurrency to decouple the two directions to avoid the one round-trip time per request/response pair that sequential connections must incur. There is, therefore, substantial independence in concurrent connections of this type, which supports the use of a model like the one we propose. In the case of connections that are "naturally" concurrent, the two sides are accurately described using two separate connection vectors.

3.3 Abstract Source-Level Measurement

The a-b-t model provides an intuitive way of describing source behavior in an application-neutral manner that is relevant for the performance of TCP.
However, this would be of little use without a method for measuring real network traffic and casting TCP connections into the a-b-t model. We have developed an efficient algorithm that can convert an arbitrary trace of TCP/IP protocol headers into a set of connection vectors. The algorithm makes use of the wealth of information that segment headers provide to extract an accurate description of the abstract source-level behavior of the applications driving each TCP connection in the trace. It should be noted that this algorithm is a first solution to a complex inference problem in which we are trying to understand application behavior from the segment headers of a measured TCP connection without examining payloads, and hence without any knowledge of the identity of the application driving the connection. This implies "reversing" the effects of TCP and the network mechanisms that determine how ADUs are converted into the observed segments that carry the ADU. The presented algorithm is by no means the only one possible, or the most sophisticated one. However, we believe it is sufficiently accurate for our purpose, and we provide substantial experimental evidence in this and later chapters to support this claim.

3.3.1 From TCP Sequence Numbers to Application Data Units

The starting point of the algorithm is a trace of TCP segment headers, $T_h$, measured on some network link. Our technique applies to TCP connections for which both directions are measured (known as a bidirectional trace), but we will also comment on the problem of extracting a-b-t connection vectors from a trace with only one measured direction (a unidirectional trace). While most public traces are bidirectional (e.g., those in the NLANR repository [nlaa]), unidirectional traces are sometimes collected when resources (e.g., disk space) are limited.
Furthermore, routing asymmetries often result in connections that only traverse the measured link in one direction.

Figure 3.10: A first set of TCP segments for the connection vector in Figure 3.1: lossless example.

We will use Figure 3.10 to describe the basic technique for measuring ADU sizes and quiet time durations. The figure shows a set of TCP segments representing the exchange of data illustrated in the a-b-t diagram of Figure 3.1. After connection establishment (first three segments), a data segment is sent from the connection initiator to the connection acceptor. This data segment carries ADU $a_1$, and its size is given by the difference between the end sequence number and the beginning sequence number assigned to the data (bytes 1 to 341). In response to this data segment, the other endpoint first sends a pure acknowledgment segment (with cumulative acknowledgment number 342), followed by two data segments (with the same acknowledgment numbers). This change in the directionality of the data transmission makes it possible to establish a boundary between the first data unit $a_1$, which was transported using a single segment and had a size of 341 bytes, and the second data unit $b_1$, which was transported using two segments and had a size of 2,555 bytes.

The trace of TCP segments $T_h$ must include a timestamp for each segment that reports the time at which the segment was observed at the monitoring device. Timestamps provide a way of estimating the duration of quiet times between ADUs. The duration of $ta_1$ is given by the difference between the timestamp of the 4th segment (the last and only segment of $a_1$), and the timestamp of the 6th segment (the first segment of $b_1$).
The duration of $tb_1$ is given by the difference between the timestamp of the last data segment of $b_1$ (7th segment in the connection) and the timestamp of the first FIN segment (9th segment in the connection). Note that the location of the monitoring point between the two endpoints affects the measured duration of $ta_1$ and $tb_1$ (but not the measured sizes of $a_1$ and $b_1$). Measuring the duration of $ta_1$ from the monitoring point 1 shown in Figure 3.10 results in an estimated time $t_1$ that is larger than the estimated time $t_2$ measured at monitoring point 2. Inferring application-layer quiet time durations is always complicated by this kind of measurement variability (among other causes), so short quiet times (with durations up to a few hundred milliseconds) should not be taken into account. Fortunately, the larger the quiet time duration, the less significant the measurement variability becomes, and the more important the effect of the quiet time is on the lifetime of the TCP connection. We can therefore choose to assign a value of zero to any measured quiet time whose duration is below some threshold, e.g., 1 second, or simply use the measurement, disregarding the minor impact of its inaccuracy.

If all connections were as "well-behaved" as the one illustrated in Figure 3.10, it would be trivial to create an algorithm to extract connection vectors from segment header traces. This could be done by simply examining the segments of each connection and counting the bytes sent between data directionality changes. In practice, segment reordering, loss, retransmission, duplication, and concurrency make the analysis much more complicated. Figure 3.11 shows a second set of segment exchanges that carry the same a-b-t connection vector of Figure 3.1. The first data segment of the ADU sent from the connection acceptor, the 6th segment, is lost somewhere in the network, forcing this endpoint to retransmit this segment some time later as the 9th segment.
Depending on the location of the monitor (before or after the point of loss), the collected segment header trace may or may not include the 6th segment. If this segment is present in the trace (as in the trace collected at monitoring point 2), the analysis program must detect that the 9th segment is a retransmission and ignore it. This ensures we compute the correct size of $b_1$, i.e., 2,555 bytes rather than 4,015 bytes. If the lost segment is not present in the trace (as in the trace collected at monitoring point 1), the analysis must detect the reordering of segments using their sequence numbers and still output a size for $b_1$ of 2,555 bytes. Measuring the duration of $ta_1$ is more difficult in this case, since the monitor never saw the 6th segment. The best estimate is the time $t_1$ between the segment with sequence number 341 and the segment with sequence number 2555. Note that if the 6th segment is seen (as for a trace collected at monitoring point 2), the best estimate is the time $t_2$ between the 5th and 6th segments.

Figure 3.11: A second set of TCP segments for the connection vector in Figure 3.1: lossy example.

A data acquisition algorithm capable of handling these two cases cannot simply rely on counts and data directionality changes, but must keep track of the start of the current ADU, the highest sequence number seen so far, and the timestamp of the last data segment. In our analysis, rather than trying to handle every possible case of loss and retransmission, we rely on a basic property of TCP to conveniently reorder segments and still obtain the same ADU sizes and inter-ADU quiet time durations. This makes our analysis simpler and more robust.
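The bookkeeping just described (start of the current ADU, highest sequence number seen so far, and timestamp of the last data segment) can be illustrated with a minimal sketch. It is not the analysis program used in this dissertation. It assumes a sequential connection, considers only data segments, and uses hypothetical names (`DataSegment`, `seq_end`, `extract_adus`); `seq_end` stands for the sequence number of the last data byte carried by a segment, as in the "seqno" labels of Figures 3.10 and 3.11. The timestamps in the usage example are made up.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

class DataSegment(NamedTuple):
    time: float      # timestamp at the monitor, in seconds
    direction: str   # "a": initiator to acceptor, "b": acceptor to initiator
    seq_end: int     # sequence number of the last data byte carried

def extract_adus(segments: List[DataSegment]) -> List[Tuple[str, int, float]]:
    """Return one (direction, size in bytes, quiet time before) per ADU."""
    highest: Dict[str, int] = {"a": 0, "b": 0}  # highest sequence number seen
    adus: List[Tuple[str, int, float]] = []
    current: Optional[str] = None   # direction of the ADU being accumulated
    start = 0                       # highest[current] when this ADU began
    last_data_time = 0.0            # timestamp of the last segment with new data
    quiet_before = 0.0
    for seg in segments:
        if seg.seq_end <= highest[seg.direction]:
            continue  # retransmission or late segment: carries no new data
        if seg.direction != current:
            if current is not None:
                # Directionality change: close the ADU accumulated so far.
                adus.append((current, highest[current] - start, quiet_before))
                quiet_before = seg.time - last_data_time
            current = seg.direction
            start = highest[current]
        highest[seg.direction] = seg.seq_end
        last_data_time = seg.time
    if current is not None:
        adus.append((current, highest[current] - start, quiet_before))
    return adus

# Lossy example of Figure 3.11 as seen from each monitoring point
# (illustrative timestamps only).
point_2 = [DataSegment(0.00, "a", 341), DataSegment(0.05, "b", 1460),
           DataSegment(0.06, "b", 2555), DataSegment(0.30, "b", 1460)]
point_1 = [DataSegment(0.00, "a", 341), DataSegment(0.07, "b", 2555),
           DataSegment(0.31, "b", 1460)]
sizes_2 = [(d, size) for d, size, _ in extract_adus(point_2)]
sizes_1 = [(d, size) for d, size, _ in extract_adus(point_1)]
```

In both traces the sketch reports a 341-byte ADU followed by a 2,555-byte ADU, because the retransmitted segment never raises the highest sequence number. Only the measured quiet time differs between the two monitoring points.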
3.3.2 Logical Order of Data Segments

A fundamental invariant that underlies our previous ADU analyses is that every byte of application data in a TCP connection receives a sequence number, which is unique for its direction⁷. This property also means that data segments transmitted in the same direction can always be logically ordered by sequence number, and this order is independent of both the time at which segments are observed and any reordering present in the trace.

The logical order of data segments is a very intuitive notion. If segments 6 and 7 in Figure 3.10 carried an HTML page, segment 6 carried the first 1,460 characters of this page, while segment 7 carried the remaining 1,095. Segment 6 logically preceded segment 7. When the same page is transmitted in Figure 3.11, the first half of the HTML is in segment 6 (which was lost) and again in segment 9. Both segments 6 and 9 (which were identical) logically precede segment 7, which carried the second half of the HTML page.

The notion of logical order of data segments can also be applied to segments flowing in opposite directions of a sequential TCP connection. Each new data segment in a sequential connection must acknowledge the final sequence number of the last in-order ADU received in the opposite direction. If this is not the case, then the new data is not sent in response to the previous ADU, and the connection is not sequential (i.e., two ADUs are being sent simultaneously in opposite directions). In the previous examples in Figures 3.10 and 3.11, we can see that both data segments comprising $b_1$ acknowledge the final sequence number of $a_1$. Intuitively, no data belonging to $b_1$ can be sent by the server until $a_1$ is completely received and processed. The data in $a_1$ logically precede the data in $b_1$, and therefore the segment carrying $a_1$ logically precedes the segments carrying $b_1$.
Given the sequence and acknowledgment numbers of two data segments, flowing in the same or in opposite directions, we can always say whether the two segments carried the same data, or one of them logically preceded the other.

⁷ This is true as long as the connection carries 4 GB or less. Otherwise, sequence numbers are repeated due to the wraparound of their 32-bit representation. We discuss how to address this difficulty at the end of Section 3.3.3.

Connections that fit into the sequential a-b-t model are said to preserve a total order of data segments with respect to the logical flow of data:

For any pair of data segments p and q, such that p is not a retransmission of q or vice versa, either the data in p logically precedes the data in q, or the data in q logically precedes the data in p.

In the example in Figure 3.11, the data in segment 9 logically precedes the data in segment 7 (e.g., segment 9 carries the first 1,460 bytes of a web page, and segment 7 carries the rest of the bytes). We know this because the sequence numbers of the bytes in segment 9 are below the sequence numbers of the bytes in segment 7. The first monitoring point observes segment 7 before segment 9, so the temporal order of these two segments did not match their logical data order. A total order also exists between segments that flow in opposite directions. In the example in Figure 3.11, the data in segment 4 logically precede the data carried in the rest of the data segments in the connection. Timestamps and segment reordering play no role in the total order that exists in any sequential connection.

Logical data order is not present in data-concurrent connections, such as the one shown in Figure 3.8. For example, the segment that carried the last b-type ADU (the 438 don't send ADU) may have been sent roughly at the same time as another segment carrying some of the new data of the data unit sent in the opposite direction (such as a CHECK ADU).
Each segment would use new sequence numbers for its new data, and it would acknowledge the data received so far by the endpoint. Since the endpoints have not yet seen the segment sent from the opposite endpoint, the two segments cannot acknowledge each other. Therefore, there is no causality between the segments, and no segment can be said to precede the other.

This observation provides a way of detecting data concurrency purely from the analysis of TCP segment headers. The idea is that a TCP connection that violates the total order of data segments described above can be said to be concurrent with certainty. This happens whenever a pair of data segments, sent in opposite directions, do not acknowledge each other, and therefore cannot be ordered according to the logical data order.

Formally, a connection is considered to be concurrent when there exists at least one pair of data segments p and q that either flow in opposite directions and satisfy

$$p.\mathit{seqno} > q.\mathit{ackno} \qquad (3.1)$$

and

$$q.\mathit{seqno} > p.\mathit{ackno}, \qquad (3.2)$$

or that flow in the same direction and satisfy

$$p.\mathit{seqno} > q.\mathit{seqno} \qquad (3.3)$$

and

$$q.\mathit{ackno} > p.\mathit{ackno}, \qquad (3.4)$$

where p.seqno and q.seqno are the sequence numbers of p and q respectively, and p.ackno and q.ackno are the acknowledgment numbers of p and q respectively. Note that, for simplicity, our .ackno refers to the cumulative sequence number accepted by the endpoint (which is one unit below the actual acknowledgment number stored in the TCP header [Pos81]). The first pair of inequalities defines the bidirectional test of data concurrency, while the second pair defines the unidirectional test of data concurrency. We next discuss why a connection satisfying one of these tests must necessarily be associated with concurrent data exchanges.

We consider first the case where p and q flow in opposite directions, assuming without loss of generality that p is sent from initiator to acceptor and q from acceptor to initiator.
If they are part of a sequential connection, either p is sent after q reaches the initiator, in which case p acknowledges q so q.seqno = p.ackno, or q is sent after p reaches the acceptor, in which case p.seqno = q.ackno. Otherwise, a pair of data segments that do not acknowledge each other exists, and the connection exhibits data concurrency.

In the case of segments p and q flowing in the same direction, we assume without loss of generality that p.seqno < q.seqno (this is the situation of inequalities (3.3) and (3.4) with the labels p and q exchanged). The only way in which q.ackno can be less than p.ackno is when p is a retransmission sent after q, and at least one data segment k with new data sent from the opposite direction arrives between the sending of q and the retransmission of p. The arrival of k increases the cumulative acknowledgment number in p with respect to q, which means that q.ackno < p.ackno. In addition, k cannot acknowledge p, or p would not be retransmitted. This implies that the connection is not sequential, since the opposite side sent new data in k without waiting for the new data in p.

Thus, only data-concurrent connections have a pair of segments that can simultaneously satisfy inequalities (3.1) and (3.2) or inequalities (3.3) and (3.4). These inequalities provide a formal test of data concurrency, which we will use to distinguish sequential and concurrent connections in our data acquisition algorithm. Data-concurrent connections exhibit a partial order of data segments, since segments flowing in the same direction can always be ordered according to sequence numbers, but not all pairs of segments flowing in opposite directions can be ordered in this manner.

Situations in which all of the segments in a concurrent data exchange are actually sent sequentially are not detected by the previous test. This can happen purely by chance, when applications send very little data or send it so slowly that concurrent data sent in the reverse direction is always acknowledged by each new data segment.
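The two concurrency tests can be written down directly from inequalities (3.1) to (3.4). The following is a minimal sketch, not the authors' implementation: the `DataSegment` type and the function name are hypothetical, `seqno` is assumed to be the final sequence number of the data in the segment, and `ackno` is assumed to be the cumulative sequence number accepted so far (one below the header value, as in the text). The unidirectional test is checked for both labelings of the pair so that the caller does not have to order p and q.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSegment:
    direction: str  # "A" (initiator to acceptor) or "B" (acceptor to initiator)
    seqno: int      # final sequence number of the data carried by the segment
    ackno: int      # cumulative sequence number accepted so far from the peer


def violates_total_order(p: DataSegment, q: DataSegment) -> bool:
    """Return True if (p, q) satisfies (3.1)-(3.2) or (3.3)-(3.4)."""
    if p.direction != q.direction:
        # Bidirectional test: neither segment acknowledges the other.
        return p.seqno > q.ackno and q.seqno > p.ackno
    # Unidirectional test: the segment with the higher sequence number
    # carries the lower acknowledgment number (either labeling).
    return (p.seqno > q.seqno and q.ackno > p.ackno) or (
        q.seqno > p.seqno and p.ackno > q.ackno
    )


# Sequential exchange: the response acknowledges all of the request.
request = DataSegment("A", seqno=340, ackno=0)
response = DataSegment("B", seqno=1460, ackno=340)
print(violates_total_order(request, response))

# Concurrent exchange: both sides send new data without having seen the other's.
print(violates_total_order(DataSegment("A", 340, 0), DataSegment("B", 500, 0)))
```

A connection would be flagged as concurrent as soon as any pair of its data segments makes this function return true. The sequence numbers in the usage lines are invented for illustration.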
Note also that the test detects concurrent exchanges of data and not concurrent exchanges of segments in which a data segment and an acknowledgment segment are sent concurrently. In the latter case, the logical order of data inside the connection is never broken because there is no data concurrency. Similarly, the simultaneous connection termination mechanism in TCP in which two FIN segments are sent concurrently is usually not associated with data concurrency. In the most common case, none of the FIN segments or only one of them carries data, so the data concurrency definition is not applicable. It is however possible to observe a simultaneous connection termination where both FIN segments carry data, which is considered concurrency if these segments satisfy inequalities (3.1) and (3.2).

3.3.3 Data Analysis Algorithm

We have developed an efficient data analysis algorithm that can determine whether a connection is sequential or concurrent, and can measure ADU sizes and quiet time durations in the presence of arbitrary reordering, duplication, and loss. Rather than trying to analyze every possible case of reordering, duplication/retransmission, and loss, we rely on the logical data order property, which does not depend on the original order and timestamps.

Given the set of segment headers of a TCP connection sorted by timestamp, the algorithm performs two passes:

1. Insert each data segment as a node into the data structure ordered_segments. This is a list of nodes that orders data segments according to the logical data order (bidirectional order for sequential connections, unidirectional order for concurrent connections). The insertion process serves also to detect data exchange concurrency. This is because connections are initially considered sequential, so their segments are ordered bidirectionally, until a segment that cannot be inserted according to this order is found. No backtracking is needed after this finding, since bidirectional order implies unidirectional order for both directions.

2. Traverse ordered_segments and output the a-b-t connection vector (sequential or concurrent) for the connection. This is a straightforward process, since segments in the data structure are already ordered appropriately.

The first step of the algorithm creates a doubly-linked list, ordered_segments, in which each list node represents a data segment using the following four fields:

- seqnoA: the sequence number of the segment in the initiator to acceptor direction (that we will call the A direction). This sequence number is determined from the final sequence number of the segment (if the segment was measured in the "A" direction), or from the cumulative acknowledgment number (if measured in the "B" direction).
- seqnoB: the sequence number of the segment in the acceptor to initiator direction.
- dir: the direction in which the segment was sent (A or B).
- ts: the monitoring timestamp of the segment.

The list always preserves the following invariant that we call unidirectional logical data order: for any pair of segments p and q sent in the same direction D, the ordered_segments node of p precedes the ordered_segments node of q if and only if p.seqnoD < q.seqnoD.

At the same time, if the connection is sequential, the data structure will preserve a second invariant that we call bidirectional logical data order, which is the opposite of the data concurrency conditions defined above: for any pair of segments p and q, the ordered_segments node of p precedes the ordered_segments node of q if and only if

$$(p.\mathit{seqnoA} < q.\mathit{seqnoA}) \wedge (p.\mathit{seqnoB} = q.\mathit{seqnoB}) \quad \text{or} \quad (p.\mathit{seqnoA} = q.\mathit{seqnoA}) \wedge (p.\mathit{seqnoB} < q.\mathit{seqnoB}).$$

Insertion of a node into the list starts backward from the tail of ordered_segments looking for an insertion point that would satisfy the first invariant. If the connection is still being considered sequential, the insertion point must also satisfy the second invariant. This implies comparing the sequence numbers of the new segment with those of the last segment in ordered_segments.
The comparison can result in the following cases:

- The last segment of ordered_segments precedes the new one according to the bidirectional order above. If so, the new segment is inserted as the new last element of ordered_segments.
- The last segment of ordered_segments and the new segment have the same sequence numbers. In this case, the new segment is a retransmission and it is discarded.
- The new segment precedes the last segment of ordered_segments according to the bidirectional order. This implies that network reordering of TCP segments occurred, and that the new segment should be inserted before the last segment of ordered_segments to preserve the bidirectional order of the data structure. The new segment is then compared with the predecessors of the last segment in ordered_segments until its proper location is found, or inserted as the first segment if no predecessors are found.
- The last segment of ordered_segments and the new segment have different sequence numbers and do not show bidirectional order. This means that the connection is concurrent. The segment is then inserted according to its unidirectional order.

Since TCP segments can be received out of order by at most W bytes (the size of the maximum receiver window), the search pass (third bullet) never goes backward more than W segments. Therefore, the insertion step takes O(sW) time, where s is the number of TCP data segments in the connection.

The second step is to walk through the linked list and produce an a-b-t connection vector. This can be accomplished in O(s) time using ordered_segments. For concurrent connections, the analysis goes through the list keeping separate data for each direction of the connection. When a long enough quiet time is found (or the connection is closed), the algorithm outputs the size of the ADU. For sequential connections, the analysis looks for changes in directionality and outputs the amount of data in between the change as the size of the ADU.
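The four insertion cases can be illustrated with a short sketch. It is not the tcp2cvec code and makes several simplifying assumptions, all hypothetical: a Python list stands in for the doubly-linked list, the class and helper names are invented, and `precedes` uses a slightly generalized form of the bidirectional order (neither sequence number decreases and at least one increases) so that non-adjacent nodes can also be compared.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    seqnoA: int  # sequence number in the initiator-to-acceptor (A) direction
    seqnoB: int  # sequence number in the acceptor-to-initiator (B) direction
    dir: str     # direction in which the segment was sent: "A" or "B"
    ts: float    # monitoring timestamp


def same_data(p: Node, q: Node) -> bool:
    return p.seqnoA == q.seqnoA and p.seqnoB == q.seqnoB


def precedes(p: Node, q: Node) -> bool:
    # Assumed generalization of the bidirectional order: neither sequence
    # number goes backward and at least one of them advances.
    return p.seqnoA <= q.seqnoA and p.seqnoB <= q.seqnoB and not same_data(p, q)


class OrderedSegments:
    def __init__(self) -> None:
        self.nodes: List[Node] = []
        self.sequential = True  # connections start out as sequential

    def insert(self, new: Node) -> None:
        if self.sequential:
            i = len(self.nodes)
            while i > 0:
                last = self.nodes[i - 1]
                if same_data(last, new):
                    return                   # retransmission: discard
                if precedes(last, new):
                    break                    # insertion point found
                if not precedes(new, last):
                    self.sequential = False  # no bidirectional order exists
                    break
                i -= 1                       # reordering: keep searching backward
            if self.sequential:
                self.nodes.insert(i, new)
                return
        self._insert_unidirectional(new)

    def _insert_unidirectional(self, new: Node) -> None:
        own = (lambda n: n.seqnoA) if new.dir == "A" else (lambda n: n.seqnoB)
        i = len(self.nodes)
        for j in range(len(self.nodes) - 1, -1, -1):
            node = self.nodes[j]
            if node.dir != new.dir:
                continue      # only same-direction nodes constrain the position
            if own(node) == own(new):
                return        # retransmission: discard
            if own(node) < own(new):
                break         # everything earlier is also lower
            i = j             # the new segment must go before this node
        self.nodes.insert(i, new)
```

Once `sequential` turns false, the nodes already in the list are left where they are, which mirrors the remark that no backtracking is needed because bidirectional order implies unidirectional order.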
Sufficiently long quiet times also mark ADU boundaries in sequential connections, indicating an epoch without one of the ADUs.

Reordering makes the computation of quiet times more complex than it seems. As shown in Figure 3.11, if the monitor does not see the first instance of the retransmitted segment, the quiet times should be computed based on the segments with sequence numbers 341 and 2555. This implies adding two more fields to the list nodes:

- min_ts: the minimum timestamp of any segment whose position in the order is not lower than the one represented by this node. Due to reordering, one segment can precede another in the bidirectional order and at the same time have a greater timestamp. In this case, we can use the minimum timestamp as a better estimate of the send time of the lower segment.
- max_ts: the maximum timestamp of any segment whose place in the order is not greater than the one represented by this node. This is the opposite of the previous min_ts field, providing a better estimate of the receive time of the greater segment.

These fields can be computed during the insertion step without any extra comparison of segments. The best possible estimate of the quiet time between two ADUs becomes

$$q.\mathit{min\_ts} - p.\mathit{max\_ts}$$

for p being the last segment (in the logical data order) of the first ADU, and q being the first segment (in the logical data order) of the second ADU. For the example in Figure 3.11, at monitoring point 1, the value of min_ts for the node for the 9th segment (that marks a data directionality boundary when segment nodes are sorted according to the logical data order) is the timestamp of the 7th segment. Therefore, the quiet time $ta_1$ is estimated as $t_1$.

Note that the use of more than one timestamp makes it possible to handle IP fragmentation elegantly. Fragments have different timestamps, so a single timestamp would have to be arbitrarily set to the timestamp of one of the fragments.
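As a small illustration of the min_ts and max_ts fields, the sketch below computes them after the fact from the timestamps of the nodes taken in logical data order, as a suffix minimum and a prefix maximum. This is an assumption made for brevity: the text computes the fields incrementally during insertion, and the function names and the millisecond timestamps here are hypothetical.

```python
from typing import List, Tuple


def annotate_timestamps(ts_in_logical_order: List[int]) -> List[Tuple[int, int]]:
    """Return one (min_ts, max_ts) pair per node, in logical data order."""
    ts = ts_in_logical_order
    min_ts = list(ts)
    max_ts = list(ts)
    # min_ts[i]: smallest timestamp among node i and all nodes after it.
    for i in range(len(ts) - 2, -1, -1):
        min_ts[i] = min(ts[i], min_ts[i + 1])
    # max_ts[i]: largest timestamp among node i and all nodes before it.
    for i in range(1, len(ts)):
        max_ts[i] = max(ts[i], max_ts[i - 1])
    return list(zip(min_ts, max_ts))


def quiet_time(annotated: List[Tuple[int, int]], p: int, q: int) -> int:
    """Quiet time estimate q.min_ts - p.max_ts for node indices p and q."""
    return annotated[q][0] - annotated[p][1]


# Logical order: the request, then a retransmitted first half of the response
# (observed late, at 900 ms), then the second half (observed at 300 ms).
annotated = annotate_timestamps([0, 900, 300])
print(quiet_time(annotated, p=0, q=1))
```

In this invented example the late retransmission would inflate the quiet time to 900 ms if its own timestamp were used; the min_ts of its node is instead the 300 ms timestamp of the segment that logically follows it.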
With our algorithm, the first fragment provides sequence numbers and usually min_ts, while the last fragment usually provides max_ts.

Other Issues in Trace Processing

Our trace processing algorithm makes two assumptions. First, it assumes we can isolate the segments of individual connections. Second, it assumes that no wraparound of sequence numbers occurs (otherwise, logical data order would not be preserved). These two assumptions can be satisfied by preprocessing the trace of segment headers.

Isolating the segments of individual TCP connections was accomplished by sorting packet header traces on five keys: source IP address, source port number, destination IP address, destination port number, and timestamp. The first four keys can separate segments from different TCP connections as long as no source port number is reused. When a client establishes more than one connection to the same server (and service), these connections share IP addresses and destination port numbers, but not source port numbers. This is true unless the client is using so many connections that it reuses a previous source port number at some point. Finding such source port number reuses is relatively common in our long traces, which are at least one hour long. Since segment traces are sorted by timestamp, it is possible to look for pure SYN segments and use them to separate TCP connections that reuse source port numbers. However, SYN segments can suffer from retransmissions, just like any other segment, so the processing must keep track of the sequence number of the last SYN segment observed. Depending on the operating system of the connection initiator, this sequence number is either incremented or randomly set for each new connection. In either case, the probability of two connections sharing SYN sequence numbers is practically zero. Segment sorting according to the previous 5 keys requires O(s log s) time (we use the Unix sort utility for our work).
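The sort-then-split preprocessing can be sketched as follows. This is an illustrative approximation rather than the authors' tooling: the `Header` record and its field names are hypothetical, sorting is done in memory instead of with the Unix sort utility, and only one direction of each connection is handled (pairing the two directions is omitted).

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


@dataclass(frozen=True)
class Header:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    ts: float
    seqno: int
    syn: bool = False  # True only for pure SYN segments


def five_key(h: Header):
    return (h.src_ip, h.src_port, h.dst_ip, h.dst_port, h.ts)


def split_connections(headers: List[Header]) -> List[List[Header]]:
    """Sort on the five keys, then cut each 4-tuple group at every new SYN."""
    connections: List[List[Header]] = []
    ordered = sorted(headers, key=five_key)
    for _, group in groupby(ordered, key=lambda h: five_key(h)[:4]):
        current: List[Header] = []
        last_syn_seqno = None
        for h in group:
            # A SYN whose sequence number differs from the last SYN seen
            # starts a new connection; an identical one is a retransmission.
            if h.syn and h.seqno != last_syn_seqno:
                if current:
                    connections.append(current)
                current = []
                last_syn_seqno = h.seqno
            current.append(h)
        if current:
            connections.append(current)
    return connections
```

With a trace in which one client port is reused after a first connection has ended, the two connections end up in separate lists even though they share all four address and port keys.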
It is also possible to process the data without an initial sorting step by keeping state in memory for each active connection. On the one hand, this can potentially eliminate the costly O(s log s) step, making the entire processing run in linear time. On the other hand, it complicates the implementation, and increases the memory requirements substantially (footnote 8). Detecting the existence of distinct connections with identical source and destination IP addresses and port numbers requires O(s) time, simply by keeping track of SYN sequence numbers as discussed above. In our implementation, this detection is done at the same time as segments are inserted into the ordered_segments data structure, saving one pass.

Footnote 8: The well-known tcptrace tool [Ost] provides a good example of the difficulty of efficiently implementing this technique. tcptrace can analyze multiple connections at the same time, by keeping separate state for each connection, and making use of hashing to quickly locate the state corresponding to the connection to which a new segment belongs. When this tool is used with our traces, we quickly run out of memory on our processing machines (which have 1.5 GB of RAM). This occurs even when we use tcptrace's real-time processing mode, which is supposed to be highly optimized. We believe it is possible to perform our analysis without the sorting step, but it is certainly much more difficult to develop a memory-efficient implementation.

TCP sequence numbers are 32-bit integers, and the initial sequence number of a TCP connection can take any value between 0 and $2^{32} - 1$. This means that wraparounds are possible, and relatively frequent. One way to handle sequence number wraparound is by keeping track of the initial sequence number and performing a modular subtraction. However, if the SYN segment of a connection is not observed (and therefore the initial sequence number is unknown), using modular arithmetic will fail whenever the connection suffers from reordering of the first observed segments.
In this case the subtraction would start in the wrong place, i.e., from the sequence number of the first segment seen, which is not the lowest sequence number due to the reordering. One solution is to use backtracking, which complicates the processing of traces.

A related problem is that representing sequence numbers as 32-bit integers is not sufficient for connections that carry more than $2^{32}$ bytes of data (4 GB). The simplest way to address this measurement problem is to encode sequence numbers using more than 32 bits in the ordered_segments data structure. In our implementation we use 64 bits to represent sequence numbers, and rely on the following algorithm (footnote 9) to accurately convert 32-bit sequence numbers to 64-bit integers even in the presence of wraparounds.

The algorithm makes use of a wraparound counter and a pair of flags for each direction of the connection. The obvious idea is to increment the counter each time a transition from a high sequence number to a low sequence number is seen. However, due to reordering, the counter could be incorrectly incremented more than once. For example, we could observe four segments with sequence numbers $2^{32} - 1000$, $1000$, $2^{32} - 500$, and $2000$. Wraparound processing should convert them into $2^{32} - 1000$, $2^{32} + 1000$, $2^{32} - 500$, and $2^{32} + 2000$. However, if the wraparound counter is incremented every time a transition from a high sequence number to a low sequence number is seen, the counter would be incremented once for the segment with the sequence number 1000 and again for the segment with sequence number 2000. In this case, the wraparound processing would result in four segments with sequence numbers $2^{32} - 1000$, $2^{32} + 1000$, $2^{32} - 500$, and $2^{32} + 2^{32} + 2000$. The second increment of the counter would be incorrect.

The solution is to use a flag that is set after a "low" sequence number is seen, so the counter is incremented only once after each "crossing" of $2^{32}$. This opens up the question of when to unset this flag so that the next true crossing increments the counter. This can be solved by keeping track of the crossing of the middle sequence number. In our implementation, we use two flags, low_seqno and high_seqno, which are set independently. If the next segment has a sequence number in the first quarter of $2^{32}$ (i.e., in the range between 0 and $2^{30} - 1$), the flag low_seqno is set to true. If the next segment has a sequence number in the fourth quarter of $2^{32}$ (i.e., in the range between $2^{31}$ and $2^{32} - 1$), the other flag high_seqno is set to true. These flags are unset, and the counter incremented, when both flags are true and the next segment is not in the first or the fourth quarter of $2^{32}$. Sequence numbers in the first quarter are incremented to $2^{32}$ times the counter plus 1. The rest are incremented by $2^{32}$ plus the counter. This handles the pathological reordering case in which the sequence number of the first segment in a connection is very close to zero, and the next segment is very close to $2^{32}$. In this case the low sequence number would be incremented by $2^{32}$. This algorithm requires no backtracking, and runs in O(s) time. In our implementation, the sequence number conversion algorithm has been integrated into the same pass as the insertion step of the ADU analysis.

Footnote 9: We have not addressed the extra complexity that TCP window scaling for Long-Fat-Networks (RFC 1323 [JBB92]) introduces. It is often the case that TCP options are not available in the traces, so the use of window scaling and TCP timestamps has to be inferred from the standard TCP header. This is a daunting task. If the options are available, it is straightforward to combine regular sequence numbers and timestamps to handle this case.

Our data acquisition techniques have been implemented in the analysis program tcp2cvec. The program also handles a number of other difficulties that arise when processing real traces, such as TCP implementations that behave in non-standard ways.
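The counter-and-flags idea for sequence number conversion can be sketched as below. This is one possible reading of the description, not the tcp2cvec code, and it rests on assumptions that go beyond the text: the fourth quarter is taken to start at $3 \cdot 2^{30}$, a first-quarter number receives the extra $2^{32}$ only while high_seqno is set, all other numbers are offset by $2^{32}$ times the counter, and the pathological case of a reordered first segment is not handled. The class name is hypothetical, and one instance would be needed per direction.

```python
QUARTER = 1 << 30  # size of one quarter of the 32-bit sequence space
MOD = 1 << 32      # size of the 32-bit sequence space


class SeqnoUnwrapper:
    """Convert 32-bit sequence numbers to wide integers (one per direction)."""

    def __init__(self) -> None:
        self.counter = 0         # number of completed wraparounds
        self.low_seqno = False   # a first-quarter number has been seen
        self.high_seqno = False  # a fourth-quarter number has been seen

    def unwrap(self, seq32: int) -> int:
        in_first = seq32 < QUARTER
        in_fourth = seq32 >= 3 * QUARTER  # assumed boundary of the last quarter
        if in_first:
            self.low_seqno = True
        elif in_fourth:
            self.high_seqno = True
        elif self.low_seqno and self.high_seqno:
            # Safely past the crossing: count it once and rearm both flags.
            self.counter += 1
            self.low_seqno = False
            self.high_seqno = False
        epoch = self.counter
        if in_first and self.high_seqno:
            # Low number seen while the crossing is in progress: it already
            # belongs to the next pass over the sequence space.
            epoch += 1
        return seq32 + epoch * MOD


# The reordered example from the text: 2^32-1000, 1000, 2^32-500, 2000.
u = SeqnoUnwrapper()
converted = [u.unwrap(s) for s in (MOD - 1000, 1000, MOD - 500, 2000)]
print([c - MOD for c in converted])
```

Python integers are unbounded, so no explicit 64-bit type is needed here. In a language with fixed-width integers the result would be stored in a 64-bit field, as the text describes.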
In addition, tcp2cvec also implements the analysis of network-level parameters described in the next chapter.

3.4 Validation using Synthetic Applications

The data analysis techniques described in the previous section are based on a number of properties of TCP that are expected to hold for the vast majority of connections recorded. For example, the logical data order property should always hold, since TCP would fail to deliver data to applications otherwise. There are, however, a number of possible sources of uncertainty in the accuracy of the data acquisition method, and this section studies them using testbed experiments.

The concept of an ADU provides a useful abstraction for describing the demands of applications for sending and receiving data using a TCP connection. However, the ADU concept is not really part of the interface between applications and TCP. In practice, each TCP connection results from the use of a programming abstraction, called a socket, that receives requests from the applications to send and receive data. These requests are made using a pair of socket system calls, send() (application's write) and recv() (application's read). These calls pass a pointer to a memory buffer where the operating system can read the data to be sent or write the data received. The size of the buffer is not fixed, so applications are free to decide how much data to send or receive with each call and can even use different sizes for different calls. As a result, applications may use more than one send system call per ADU, and there may be significant delays between successive calls belonging to the same ADU. These operations can further interact with mechanisms in the lower layers (e.g., delayed acknowledgment, TCP windowing, IP buffering, etc.), creating even longer delays between segments carrying ADUs. Such delays distort the relationship between application-layer quiet times and segment dynamics, complicating the detection of ADU boundaries due to quiet times.
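To make the mismatch between ADUs and socket calls concrete, the sketch below writes one ADU with several send calls and reads it back with a different I/O size. It is only an illustration under stated assumptions: the helper names are hypothetical, and `socket.socketpair()` yields a connected pair of local stream sockets rather than a real TCP connection between two hosts.

```python
import socket


def send_adu_in_chunks(sock: socket.socket, adu: bytes, io_size: int) -> int:
    """Write one ADU using several send calls; return the number of calls."""
    calls = 0
    for offset in range(0, len(adu), io_size):
        sock.sendall(adu[offset:offset + io_size])
        calls += 1
    return calls


def recv_exact(sock: socket.socket, size: int, io_size: int) -> bytes:
    """Read exactly `size` bytes, asking for at most `io_size` bytes per call."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(min(io_size, remaining))
        if not chunk:
            raise ConnectionError("peer closed before the full ADU arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


sender, receiver = socket.socketpair()
adu = bytes(3000)  # a 3,000-byte ADU of zero bytes
calls = send_adu_in_chunks(sender, adu, io_size=700)
received = recv_exact(receiver, len(adu), io_size=512)
sender.close()
receiver.close()
print(calls, len(received))
```

The receiver sees a single byte stream regardless of how many write calls produced it, which is why ADU boundaries have to be inferred from direction changes and quiet times rather than from the calls themselves.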
To test the accuracy of our data acquisition techniques, we constructed a suite of test applications that exercise TCP in a systematic manner. The basic logic of each test application is to establish a TCP connection and send a sequence of ADUs with a random size, and with random delays between each pair of ADUs. In the a-b-t model notation, this means creating connections with random $a_i$, $b_i$, $ta_i$ and $tb_i$. As the test application runs, it logs ADU sizes and various time intervals as measured by the application. In addition, the test application can set the socket send and receive calls to random I/O sizes, and can introduce random delays between successive send or receive calls within a single ADU.

In our experiments, the test application was run between two real hosts, and traces of the segment headers were collected and analyzed using our measurement tool. Our validation compared the result of this analysis and the correct values logged by the applications. We conducted an extensive suite of tests, but limit our report to only some of the results. Specifically we only show the results with the most significant deviations from the correct values

Publication date: 2006